Abstract - The problem of serving multicast flows in a crossbar
switch is considered. Intra-flow linear network coding is shown to 
achieve a larger rate region than the case without coding. A traffic 
pattern is presented which is achievable with coding but requires 
a switch speedup when coding is not allowed. The rate region with 
coding can be characterized in a simple graph-theoretic manner, 
in terms of the stable set polytope of the "enhanced conflict 
graph". No such graph-theoretic characterization is known for 
the case of fanout splitting without coding. 

The minimum speedup needed to achieve 100% throughput 
with coding is shown to be upper bounded by the imperfection 
ratio of the enhanced conflict graph. When applied to K x N
switches with unicasts and broadcasts only, this gives a bound
of $\min\{(2K-1)/K,\ 2N/(N+1)\}$ on the speedup. This shows that
speedup, which is usually implemented in hardware, can often be
substituted by network coding, which can be done in software.

Computing an offline schedule (using prior knowledge of the 
flow rates) is reduced to fractional weighted graph coloring. A 
graph-theoretic online scheduling algorithm (using only queue 
occupancy information) is also proposed, that stabilizes the 
queues for all rates within the rate region. 

Index Terms - Network coding, multicast switch, scheduling,
speedup, rate region, imperfection ratio. 



I. Introduction 

NETWORK information flow is a field of information 
theory which aims to quantify the maximum information 
flow through a network. The network information flow prob- 
lem is closely related to the multi-commodity flow problem 
and has been studied extensively owing to its wide applications 
in communication networks. 

An information network is represented by a directed graph
$N = (V, E)$, where $(i, j) \in E$ if there is a communication link
from node $i \in V$ to node $j \in V$. Each link is associated with
a capacity, and we assume that the link is error-free as long as
the rate is below this capacity. There are two special subsets $S$
and $T$ of $V$. The set $S$ is the set of sources, which generate
mutually independent streams of information or messages. The
set $T$ is the collection of sinks. Each sink node $i \in T$ requires
some subset of the information streams from the source nodes.
This is called the multicast requirement.

The main question in network information flow is: given
a network $N = (V, E)$ and a multicast requirement, is it
possible to satisfy all the sink nodes without violating the
capacity constraints? Before the notion of network coding was
introduced, researchers focused on answering this question in
a router network. A router network is a network where each
packet that enters a node can only be routed or relayed onto
some outgoing link(s). In other words, the intermediate nodes
in the network cannot modify the packets that they receive;
they can only forward the packets. However, Ahlswede et
al. [2] introduced the notion of network coding, which allows
mixing of data at intermediate network nodes. Section II-A
provides a brief overview of network coding.

In this paper, we study the benefit obtained from using
network coding in a special type of network, the multicast
crossbar switch (see Section II-B for background). A crossbar
switch is a network of depth one: it consists of source
nodes (the inputs) and sink nodes (the outputs), with every
input being directly connected to every output. A crossbar
switch with K inputs and N outputs has a K x N matrix of
intersections where the inputs and outputs "cross", as shown
in Figure 1. It can be arranged to have what we call the
intrinsic multicast capability: an input can convey a packet
to several outputs at the same time, by simply connecting
the input line to the corresponding output lines. However, an
input cannot convey different packets to different outputs at
once. The crossbar switch is one of the principal architectures
used to construct bigger switches. It is widely used in
information processing applications such as telephony and packet
switching, which makes it an important component of
communication networks, in particular the Internet.

Fig. 1. A diagram of a K x N input-queued crossbar switch

We will focus on input-queued crossbar switches. An
input-queued switch is one which has queues at each input to store
incoming packets before they are processed by the switching
fabric. All input and output lines are assumed to have the same
capacity, called the line rate. A traffic pattern for which the total
rate of flows traversing each input or output is no more than the
line rate is said to be admissible. A traffic pattern which can
be served without causing the queues to grow unboundedly is
said to be sustainable or achievable. Note that admissibility
is a necessary condition for a traffic pattern to be sustainable.
Indeed, if the total rate of flows going into an output exceeds
the outgoing line rate, it is physically not possible to keep the
queues bounded.

The input-queued crossbar switch has been studied ex- 
tensively, especially in the context of unicast traffic, where 
unicast means that for each stream of information, there is 
only one sink. A unicast traffic pattern is a set of information 
streams each of which is a unicast flow. It is known that 
every admissible unicast traffic pattern is also achievable 
[27], [37]. In other words, as long as no input or output is
oversubscribed, the queues can be stabilized, thereby achieving 
100% throughput. 

Unfortunately, this result does not extend to multicast flows, 
where a single stream of information from a source may be 
destined to reach more than one sink. The extension of the 
problem from unicast to multicast flows is thus intrinsically 
more difficult. Marsan et al. [26] showed that 100% throughput
cannot be achieved for multicast flows in an input-queued 
switch. The authors gave a characterization of the rate region 
achievable in a multicast switch with fanout splitting and also 
defined the optimal scheduling policy. Fanout splitting is the 
ability to serve a multicast flow partially to only a subset of 
its destined outputs, and complete the service in subsequent 
slots; see Section II-C1 for more details.

Switches have a feature called speedup which allows them
to process packets faster than the input or output line rate.
This feature is usually implemented using parallelization of
hardware [29]. A formal definition of speedup can be found
in Section II-C3. In [26], the minimum speedup needed to
achieve 100% throughput is shown to grow unboundedly
with the switch size for multicast traffic. It is not hard to
observe that with enough speedup, a switch can achieve any
admissible traffic pattern; however, as is the case with most
hardware features, speedup is expensive to implement and hard
to change once the switch is installed.

Another means of increasing throughput in a switch is
network coding. We study input-queued switches that are
loaded with both unicast and multicast traffic, where inputs
are allowed to perform network coding. In this paper, we
consider a specific type of network coding, linear intra-flow
network coding, for its simplicity and optimality. Note
that network coding may be implemented in software, which
makes it preferable to speedup as a way to increase the switch
throughput. For further details on network coding, see Section
II-A. We ask the following questions: what is the magnitude of
the benefit we obtain from using network coding in multicast
switches? Can we replace speedup with network coding? If
not completely, then by how much? Can we use the insight
we gain here to design scheduling algorithms for multicast
switches with network coding? The main contributions of this
paper are:

1) We prove that linear network coding increases the
achievable rate region of a multicast switch. In Section
IV-E, we present an example traffic pattern that
demonstrates how network coding increases a switch's
throughput. In addition, in Section VI-D, we show that
network coding allows the switch to be robust to heavy
traffic load, resulting in smaller delay compared to
fanout-splitting.

2) We propose a graph-theoretic representation of any
traffic pattern in terms of what we call the enhanced
conflict graph and provide a simple graph-theoretic
characterization of the multicast switch rate region with
coding in Section III. We prove that the achievable
rate region of a network coding multicast switch is a
projection of the stable set polytope of the enhanced
conflict graph of the traffic pattern.

3) We show that network coding can in many cases
substitute for speedup. In Section V, we prove our main
result (Theorem 9), which relates the imperfection ratio
of the enhanced conflict graph to the speedup needed
to achieve all admissible rates. Using our main result,
we provide a lower bound and a graph-theoretic upper
bound on the minimum speedup needed to achieve
100% throughput. In particular, for a K x N switch
with traffic pattern restricted to unicasts and broadcasts
only, we show that the minimum speedup is at most
$\min\{(2K-1)/K,\ 2N/(N+1)\}$. This result, when applied to a 2 x N
switch, gives a bound of 1.5 on speedup; however, we
conjecture that the actual speedup required to achieve
100% throughput in a K x N switch with traffic
patterns consisting of unicasts and broadcasts only is 1.25
(Conjecture 2 in Section V-F).

4) In Section VI, we discuss offline and online scheduling
algorithms for a multicast switch to achieve the rate
region while stabilizing the queues.

As mentioned earlier, for the case of fanout splitting without 
coding, [26] gave a characterization of the rate region as the
convex hull of certain modified departure vectors. However, 
a graph-theoretic formulation of the same is not known. On 
the other hand, for the case with coding, our graph-theoretic 
formulation helps us understand the effect of the traffic pattern 
on the throughput. The properties of the enhanced conflict 
graph can be used to derive insight on what kind of traffic 
patterns are "difficult" in terms of computing the schedule, 
and in terms of achieving 100% throughput. 

This paper is organized as follows. Section II presents the
background and preliminary definitions that will be used in
the rest of the paper. This paper mainly draws ideas from
network coding (Section II-A) and graph theory (Section II-D).
Section III discusses the benefits of network coding when
applied to multicast switches. In particular, we present a
graph-theoretic formulation of network coding in Sections III-A
and III-B. Section V gives the relationship between speedup and
the imperfection ratio of the enhanced conflict graph, which leads
to our main result: an upper bound on the minimum speedup
required to achieve 100% throughput in a multicast switch with
coding. In Section VI, we use the graph-theoretic formulation
of network coding to propose offline and online algorithms for
scheduling of a multicast switch. Finally, in Section VII, we
summarize the contributions of this paper and discuss potential
avenues for future work.

II. Preliminaries 

This section gives an overview of the relevant work in the
area of network coding (Section II-A), multicast switching
theory (Sections II-B and II-C), and graph theory (Section
II-D).

A. Network coding 

Reference [2] showed that coding within a network allows
a source to multicast information at a rate approaching the 
smallest cut between the source and any receiver, as the coding 
field size approaches infinity. Li, Yeung and Cai [24] showed 
that any solvable network with one source and multiple sinks 
(called a multicast network) has a scalar linear solution over a
sufficiently large finite field alphabet. In addition, [24] showed
that in multicast networks, linear coding suffices to achieve
the optimum, which is the max-flow from the source to each
sink. Subsequently, Koetter and Médard [22] showed that in
the general network coding problem, deciding achievability 
and solvability is equivalent to deciding whether a certain 
algebraic variety is empty or not. Noting the potential of linear 
network coding, they presented an algebraic framework for 
linear network coding in arbitrary networks and showed that 
a simple linear code is sufficient to achieve capacity in the 
multicast problem. 

As a result, there has been a great emphasis on linear
network coding. For instance, Ho et al. [16] proposed a simple,
practical capacity-achieving code. They proposed that every
node construct its linear code randomly and independently
from all other nodes. This simple construction was shown to
achieve capacity with probability exponentially approaching 1
with the field size. Médard et al. [28] conjectured that every
solvable network has a linear solution over some finite field
alphabet and vector dimensions. However, Dougherty et al.
[8] provided a counterexample: a non-multicast network which
is not solvable with linear coding. Although [8] proved that
linear network coding is not sufficient for general networks,
linear network coding is nevertheless still a powerful tool.
In particular, if only intra-session coding is allowed, linear 
network coding suffices for networks with multiple multicast 
sessions, including multicast switches. Linear intra-session 
coding for multiple multicast networks was studied in iflTl . 
In our paper, we only allow intra-flow coding, i.e., packets 
are coded together only if they have the same source and 
destination set. Therefore, we shall only consider linear codes. 

B. Multicast switch model 

Multicast switches can be thought of as simple information
networks where there are only sources and sinks, and no
intermediate nodes. Each source is connected to all sinks. In the most
basic model, a switch acts as a router. We will now formally
specify the switch model used in this paper.

A K x N switch consists of K sources or inputs and N
sinks or outputs. Packets arrive at inputs on input lines, and
depart from outputs on output lines. All input and output lines
are assumed to have the same capacity, called the line rate. We
consider a slotted time system, where the length of the slot
is chosen to be the reciprocal of the line rate. Henceforth,
all rates will be normalized with respect to the line rate, and
will be expressed in packets per slot. All packets are assumed
to be of the same size. The speed of the switch fabric is
assumed to be such that if it connects an input to an output,
it can transfer one packet over this connection in one slot.
This corresponds to a speedup of 1 (speedup is defined in
Section II-C). Arrivals may occur any time during a slot. All
transmissions are assumed to begin just after the beginning of
a slot and end just before the end of the same slot. The switch
configuration may change only at slot boundaries.

Definition 1 (Rate): A rate specifies the average number 
of packets that needs to be transferred from an input to the 
outputs per slot. A rate of 1/2, for example, means that on 
average the input has to send one packet over two slots. 

Definition 2 (Flow): A flow is the stream of all packets that
have a given input and a given destination set. Thus, a flow
is specified by a 2-tuple (i, J) consisting of the input i and
a set J of outputs corresponding to the destination set of the
multicast stream. This set J of outputs is called the fanout set.
Sometimes, we denote a flow by a 3-tuple (r, i, J), where r is
the rate of the flow. For example, in a 2 x 3 switch, we could
have a flow f = (1/2, 1, {1, 2}), which is a stream of packets
from input 1 to outputs 1 and 2 with a rate of 1/2.

Definition 3 (Subflow): A subflow of flow (i, J) is the part
of a flow from input i that goes to a particular output j in J.
Therefore, a subflow is specified by a 3-tuple (i, J, j)
consisting of the input i, the fanout J and one output $j \in J$. The
rate of a subflow is defined to be the rate of the flow to which
it belongs. Sometimes, we denote the subflow by a 4-tuple
(r, i, J, j), where r is the rate of the subflow. For instance, a
flow f = (1/2, 1, {1, 2}) has two subflows associated with it:
f_1 = (1/2, 1, {1, 2}, 1) and f_2 = (1/2, 1, {1, 2}, 2).
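
As an illustration of Definitions 2 and 3, the following minimal Python sketch represents a flow by its rate, input, and fanout set, and enumerates its subflows. The class name `Flow`, its field names, and the method `subflows` are hypothetical names chosen for this example; they do not come from the paper.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Flow:
    """A flow (r, i, J): rate r, input i, fanout set J (Definition 2)."""
    rate: Fraction
    src: int
    fanout: frozenset

    def subflows(self):
        """Return one 4-tuple (r, i, J, j) per output j in J (Definition 3)."""
        return [(self.rate, self.src, self.fanout, j) for j in sorted(self.fanout)]


# The example flow f = (1/2, 1, {1, 2}) from the text.
f = Flow(Fraction(1, 2), 1, frozenset({1, 2}))
for sub in f.subflows():
    print(sub)
```

Each subflow inherits the rate of its parent flow, so the two tuples printed differ only in their last component.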

The constraints on the switch configuration are specified 
below: 

• An input may send the same packet to many outputs at
once, but may not send different packets to different
outputs simultaneously. This is called the intrinsic multicast
capability.

• An output may receive a packet from only one input at
a time.

These constraints give rise to the need for queues at the
inputs, as multiple packets may arrive at an input
simultaneously. Each input maintains a separate queue for each flow.
Therefore, if we have every possible flow through an input,
then the input needs to maintain a set of $2^N - 1$ queues, one
for each non-empty subset of the N outputs;
otherwise, fewer queues will suffice. The queues are assumed
to have infinite capacity, but the goal of the scheduling
algorithms will be to keep their occupancy stable. A diagram
of a K x N input-queued multicast switch is given in Figure 2.

Definition 4 (Traffic Pattern): A traffic pattern is a
collection of flows. A traffic pattern is called admissible if the sum of
the rates of all the flows through each input or output does not
exceed one, i.e., the inputs and outputs are not oversubscribed.
A traffic pattern is said to be achievable if there exists a switch
schedule that can serve it, while keeping the queues stable.
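
The admissibility condition of Definition 4 can be checked mechanically. The sketch below is a minimal illustration, assuming flows are given as (rate, input, fanout set) triples; the function name `is_admissible` and both example patterns are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from fractions import Fraction


def is_admissible(flows):
    """Return True if no input or output is oversubscribed.

    `flows` is an iterable of (rate, input, fanout_set) triples, with rates
    normalized to the line rate (packets per slot).
    """
    input_load = defaultdict(Fraction)
    output_load = defaultdict(Fraction)
    for rate, i, fanout in flows:
        input_load[i] += rate          # a multicast packet leaves its input once
        for j in fanout:
            output_load[j] += rate     # but arrives at every output in the fanout
    loads = list(input_load.values()) + list(output_load.values())
    return all(load <= 1 for load in loads)


half = Fraction(1, 2)
# A broadcast from input 1 plus two unicasts from input 2.
pattern = [(half, 1, {1, 2}), (half, 2, {1}), (half, 2, {2})]
# Three rate-1/2 unicasts from the same input load that input with 3/2.
overloaded = [(half, 1, {1}), (half, 1, {2}), (half, 1, {3})]

print(is_admissible(pattern))     # True
print(is_admissible(overloaded))  # False
```

Exact fractions are used instead of floating point so that a load of exactly 1 is not rejected because of rounding.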

C. Scheduling strategies 

Clearly, admissibility is a necessary condition for a traffic 
pattern to be achievable; however, it is not clear whether the 
converse holds. It turns out that the converse is true for unicast 
traffic [27], but not for multicast traffic [26].

For unicast traffic, Chang et al. lH presented a scheme, 
called the Birkhoff-von Neumann switch, that not only 
achieves 100% throughput but also guarantees packet delay in 
offline settings. The Birkhoff-von Neumann switch is based 
on a theorem that says any doubly stochastic matrix can be 
expressed as a convex combination of permutation matrices 
[4], [38]. Note that any admissible unicast traffic pattern can
be converted to a doubly stochastic matrix. Then, the doubly 
stochastic matrix is decomposed into permutation matrices, 
which in turn correspond to switch states. 
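
The decomposition step can be sketched in Python as follows. This is a minimal illustration and not the scheduler of Chang et al.: it repeatedly finds a perfect matching on the positive entries (using a simple augmenting-path search, a choice made here for brevity) and peels off the corresponding permutation matrix. The function names and the 3 x 3 example matrix are assumptions made for this example.

```python
from fractions import Fraction


def perfect_matching(M):
    """Find a perfect matching using only positive entries of the square matrix M.

    Returns perm with perm[i] = column matched to row i.
    """
    n = len(M)
    match_col = [-1] * n  # match_col[j] = row currently matched to column j

    def try_row(i, seen):
        # Augmenting-path search for row i.
        for j in range(n):
            if M[i][j] > 0 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            raise ValueError("no perfect matching; row/column sums are not equal")
    perm = [0] * n
    for j, i in enumerate(match_col):
        perm[i] = j
    return perm


def birkhoff_von_neumann(M):
    """Decompose a doubly stochastic matrix into weighted permutations."""
    M = [row[:] for row in M]  # work on a copy
    terms = []
    while any(x > 0 for row in M for x in row):
        perm = perfect_matching(M)
        coeff = min(M[i][perm[i]] for i in range(len(M)))
        for i in range(len(M)):
            M[i][perm[i]] -= coeff  # zeroes at least one entry per round
        terms.append((coeff, perm))
    return terms


h, q, z = Fraction(1, 2), Fraction(1, 4), Fraction(0)
M = [[h, h, z],
     [q, q, h],
     [q, q, h]]
for coeff, perm in birkhoff_von_neumann(M):
    print(coeff, perm)  # weight and switch state (input i -> output perm[i])
```

Each returned pair gives a switch state and the fraction of time it should be used; the weights sum to one for a doubly stochastic input.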

Sundararajan et al. [32] extended this Birkhoff-von
Neumann approach to multicast switching. Using a graph-theoretic
formulation, they showed that the rate region of multicast
switching without fanout splitting (defined in Section II-C1) is
precisely the stable set polytope of the traffic pattern's "conflict
graph", which we shall discuss in Section III-A. As a result,
they showed that the problem of deciding achievability in
a multicast switch is equivalent to the membership problem
for the stable set polytope of a graph, which is known to
be NP-hard. In addition, [32] showed that computing the
offline schedule for multicast traffic, unlike that for unicast
traffic, is hard. Indeed, it is equivalent to fractional weighted
graph coloring, which is NP-hard in general. Thus, many
of the complexity and achievability results for unicast traffic
do not extend to multicast traffic. Even if a traffic pattern is
admissible, depending on the switch's capabilities, the switch
may not be able to achieve the traffic pattern.

Example 1: Consider the traffic pattern shown in Figure 3.
This traffic pattern consists of a broadcast flow (1/2, 1, {1, 2})
and two unicast flows (1/2, 2, {1}) and (1/2, 2, {2}). It
is admissible since every
input and output has a total rate of at most 1. However, if the
switch is restricted to serve the broadcast flow to all outputs
at once, i.e., it is not allowed to split the fanout, then at most
one of the three flows can be served at a time. In this case,
the sum of the rates of the three flows must be less than 1
to be achievable; however, the sum of the rates is 3/2. The
broadcast from input 1 at rate 1/2 requires half of the time.
During this time, input 2 cannot serve the two unicasts. But
that leaves input 2 with the remaining half of the slots to serve
two unicasts at rate 1/2 each, which is not possible. Therefore,
this traffic pattern is not achievable without fanout splitting.

Fig. 3. Admissible but not-achievable traffic pattern
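
The argument of Example 1 can be checked with a few lines of Python. The sketch below assumes the no-splitting conflict rule implied by the switch constraints (two flows cannot share a slot if they use the same input or have a common output); the flow labels and the helper `conflict` are hypothetical names introduced for this example.

```python
from fractions import Fraction
from itertools import combinations

half = Fraction(1, 2)
# (rate, input, fanout set) for the three flows of Example 1.
flows = {
    "broadcast": (half, 1, {1, 2}),
    "unicast_to_1": (half, 2, {1}),
    "unicast_to_2": (half, 2, {2}),
}


def conflict(f, g):
    """Without fanout splitting, two flows cannot be served in the same slot
    if they share an input or have an output in common."""
    _, i1, fanout1 = f
    _, i2, fanout2 = g
    return i1 == i2 or bool(fanout1 & fanout2)


all_conflict = all(conflict(flows[a], flows[b]) for a, b in combinations(flows, 2))
total_rate = sum(rate for rate, _, _ in flows.values())

print(all_conflict)  # True: at most one flow can be served per slot
print(total_rate)    # 3/2, more than the one packet per slot available
```

Since every pair of flows conflicts, the three flows must share the slots among themselves, and a total demand of 3/2 cannot fit.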

This observation that not all admissible traffic patterns are 
achievable raises the question of how much of the admissible 
rate region is actually achievable. To achieve those admissible 
but not achievable traffic patterns, what additional capabilities 
does a switch require? What capability of a switch is the most 
effective in increasing the achievable rate region to be at least 
the admissible rate region? In Sections II-C1, II-C2, and II-C3,
we present three approaches - fanout splitting, linear network 
coding, and speedup - to increase the rate region of a switch. 

1) Fanout splitting: There are many ways in which a
multicast switch can serve a multicast flow. The simplest
method would be to serve each multicast flow as if it were
multiple unicast flows. For example, the packets of f =
(1/2, 1, {1, 2}) could be "copied" into two separate unicasts
f_1 = (1/2, 1, {1}) and f_2 = (1/2, 1, {2}). This scheme is
inefficient because, in some cases, it converts an originally
achievable traffic pattern into one that is inadmissible. For
example, copying f' = (1/2, 1, {1, 2, 3}) into three unicasts
will make three flows with rate 1/2, which overbooks input 1.
The other extreme is to force the input to send the multicast
packet to every output node in the fanout set simultaneously,
which was described as the no-splitting strategy in [14].
However, this scheme can be restricting, as shown by the
example in Figure 3.
Fig. 4. A traffic pattern that shows the benefit of coding

Fig. 5. A traffic pattern which demonstrates the benefit of fanout-splitting
(panels: time 1 and time 2; label: "Input 1 sends the same packet")

The middle ground between copying and no-splitting is 
fanout-splitting fTT]. Fanout-splitting allows the source to 
serve subsets of the fanout set at different points in time. 
Therefore, copying and no-splitting are two extreme cases of 
fanout-splitting: the first serves the fanout set by dividing it 
into subsets of size one, the latter serves it by not splitting 
at all. By definition, fanout-splitting achieves a greater rate 
region than copying or no-splitting. 

Example 2: The pattern in Figure 3 cannot be satisfied by
a no-splitting strategy, but with fanout-splitting this traffic
pattern can be achieved, as shown in Figure 5. In Figure 5, we
can see that input 1 completes the broadcast over two slots
using fanout-splitting, while input 2 serves unicasts to the idle
outputs over the same two slots.
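
The two-slot schedule described in Example 2 can be written down and verified as follows. This is a sketch under stated assumptions: the exact assignment of outputs to slots is one natural choice consistent with the description, and the names `schedule`, `slot_is_valid`, and `delivered_rates` are hypothetical.

```python
from fractions import Fraction

# Fanout set of each flow in the pattern of Example 1.
fanout = {"broadcast": {1, 2}, "unicast_to_1": {1}, "unicast_to_2": {2}}

# One period of two slots. Each slot maps an input to
# (flow served, outputs reached in that slot).
schedule = [
    {1: ("broadcast", {1}), 2: ("unicast_to_2", {2})},  # time 1
    {1: ("broadcast", {2}), 2: ("unicast_to_1", {1})},  # time 2
]


def slot_is_valid(slot):
    """Every output may listen to at most one input in a slot."""
    used = [j for _, outputs in slot.values() for j in outputs]
    return len(used) == len(set(used))


def delivered_rates(schedule, fanout):
    """Packets completed per slot for each flow, without coding.

    A packet is complete once every output in the fanout has been served,
    so the number of completed packets is the minimum service count.
    """
    counts = {name: {j: 0 for j in outs} for name, outs in fanout.items()}
    for slot in schedule:
        for name, outputs in slot.values():
            for j in outputs:
                counts[name][j] += 1
    return {name: Fraction(min(c.values()), len(schedule))
            for name, c in counts.items()}


print(all(slot_is_valid(s) for s in schedule))
print(delivered_rates(schedule, fanout))
```

All three flows are served at rate 1/2, which matches the rates required by the traffic pattern.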

However, even with fanout-splitting, some admissible traffic
patterns are not achievable. Figure 4 gives an example of such
a case.

Example 3: The traffic pattern in Figure 4 is very similar
to that in Figure 3, but with one more output. In order
for input 2 to complete all three unicasts, input 2 needs to be
serving one of the unicasts at all times. As a result, in each
slot, input 1 can partially serve its broadcast packet to at most
two idle outputs. Therefore, to serve each broadcast packet
completely, input 1 requires two slots. This implies that input
1 can serve the broadcast flow at rate at most 1/2, even if it
is allowed to use fanout-splitting; however, the traffic pattern
shown in Figure 4 requires a broadcast rate of 2/3.

2) Linear network coding: In this paper, we consider a
model where the switch, in addition to fanout splitting, is
allowed to perform linear intra-flow coding, i.e., inputs can
now code across packets from the same flow. In the rest of
this paper, network coding means linear intra-flow coding.
The benefit of network coding can be seen in Figure 4. It
illustrates a schedule that achieves the traffic pattern which
we showed cannot be achieved using just fanout-splitting. It
is important to note that linear network coding requires
fanout-splitting. If fanout-splitting is not allowed, there is no benefit
of coding since just routing would suffice. This example shows
that the network coding rate region is strictly larger than that of
fanout-splitting.
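
To make the coding gain concrete, the sketch below gives one three-slot coded schedule for the pattern of Example 3. Since the figure is not reproduced here, this particular schedule is an assumed reconstruction: input 2 is assumed to serve its unicast to output t in slot t, and input 1 sends two broadcast packets and their XOR (coding over GF(2)) to the two idle outputs. All names and payloads are hypothetical.

```python
def xor(a, b):
    """Bytewise XOR, i.e. addition of packets over GF(2)."""
    return bytes(x ^ y for x, y in zip(a, b))


# Two consecutive packets of the broadcast flow (arbitrary example payloads).
p1, p2 = b"\x0f\xa5", b"\x3c\x5a"

# One period of three slots. In slot t, input 2 serves its unicast to output t,
# so input 1 can only reach the other two outputs.
# Each entry: (coefficients of (p1, p2), payload sent by input 1, outputs reached).
schedule = [
    ((1, 0), p1,          {2, 3}),
    ((0, 1), p2,          {1, 3}),
    ((1, 1), xor(p1, p2), {1, 2}),
]

received = {1: {}, 2: {}, 3: {}}
for slot, (coeffs, payload, outputs) in enumerate(schedule, start=1):
    assert slot not in outputs      # output `slot` is busy with input 2's unicast
    for j in outputs:
        received[j][coeffs] = payload


def decode(known):
    """Recover (p1, p2) from any two of the three transmissions."""
    known = dict(known)
    if (1, 0) not in known:
        known[(1, 0)] = xor(known[(0, 1)], known[(1, 1)])
    if (0, 1) not in known:
        known[(0, 1)] = xor(known[(1, 0)], known[(1, 1)])
    return known[(1, 0)], known[(0, 1)]


for j in (1, 2, 3):
    print(j, decode(received[j]) == (p1, p2))
```

Every output hears input 1 in exactly two of the three slots and recovers both packets, so the broadcast is served at rate 2/3 while input 2 stays busy in every slot.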

However, not all admissible rates are achievable even with
network coding. For instance, Figure 6 shows a traffic pattern
which is admissible but not achievable even when network
coding is allowed. This is because input 2 is fully loaded and
thus needs to serve one of the two unicasts in every slot. As
a result, in any slot, input 1 can serve packets to only two
outputs. Input 1 thus requires two slots to serve one packet
from its broadcast. Since the broadcast requires a rate of 1/2,
input 1 has to serve the broadcast at every time step, leaving
no time for its unicast.

Fig. 6. A traffic pattern which cannot be achieved by network coding

This observation raises the question of how
much of the admissible rate region network coding
actually achieves. In Section III, we shall discuss in more
detail such questions regarding the benefit of network coding
in switches.

Another class of linear network coding we could consider 
is inter-flow coding [9]. Inter-flow coding can encode packets
from the same flow as well as packets from different flows that 
originate from the same input. It can be shown that inter-flow 
coding has a strictly larger rate region than that of intra-flow 
coding. However, inter-flow coding is not considered in this 
work. 

3) Speedup: Multicast traffic patterns such as the one in
Figure 6 cannot be sustained even with coding, although they
are admissible. To achieve such rate points, the switch needs
to be provided with some additional capability such as speedup.

Definition 5 (Speedup): A switch is said to have a speedup
of s if the switching fabric can transfer s packets over one slot
(as defined in Section II-B) from an input to an output. This
means the switching fabric can go through s configurations
within one slot. In other words, during the time it takes for
a packet to arrive at the switch on average, the switch can
change its configuration s times.

It is important to note that with enough speedup, a switch
can achieve any admissible multicast traffic pattern even without
fanout splitting. For example, in a K x N switch, if $s \geq K$ then any
admissible traffic pattern is achievable. Given any admissible
traffic pattern, the switch can divide it up so that each of the
K inputs is separately served. Therefore, as shown in Figure
7, the switch will serve whatever traffic input 1 needs to send,
then input 2, 3, and so forth. Since the switch has a speedup
of $s \geq K$, the switch can internally process the K inputs
separately and still satisfy all the multicast requirements.
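
The time-division argument above can be sketched as follows: with a speedup of K, one slot is split into K mini-slots and mini-slot i is dedicated to input i, which then reaches its whole fanout set without any output contention. The function name `mini_slot_configurations` and the example requests are hypothetical and serve only to illustrate the idea.

```python
def mini_slot_configurations(requests, K):
    """Split one slot into K mini-slots, one per input.

    `requests[i]` is the fanout set of the packet that input i wants to send
    in this slot (inputs with nothing to send are simply absent).
    Mini-slot i connects only input i, to its whole fanout set.
    """
    configurations = []
    for i in range(1, K + 1):
        fanout = requests.get(i)
        configurations.append({i: set(fanout)} if fanout else {})
    return configurations


# Hypothetical slot in a 2 x 2 switch: input 1 broadcasts, input 2 sends to output 1.
configs = mini_slot_configurations({1: {1, 2}, 2: {1}}, K=2)
for k, config in enumerate(configs, start=1):
    print("mini-slot", k, config)
```

Because each configuration connects at most one input, no two inputs ever compete for an output, and no fanout splitting or coding is needed.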

Therefore, a key question is: what is the minimum speedup
we need to achieve all admissible traffic patterns? From our
example in Figure 7, we know that we can upper bound the
minimum speedup by K in a K x N switch even without
fanout-splitting or coding; however, can we find a better
bound? In addition, as noted in Section III, we know that
network coding increases throughput but not enough to cover
the entire admissible rate region. However, we know that with
enough speedup any admissible traffic pattern is achievable.
Then, our next question is: how much speedup does network
coding replace? This question will be discussed in more detail
in Section V.

Fig. 7. Speedup of s = K for a K x N multicast switch



D. Graph theory 

In this section, we present some preliminary definitions that
will be used throughout this paper. For a more detailed and
thorough survey on graph theory and combinatorics, see [30].

Let $G = (V, E)$ be an undirected graph with vertex set $V$
and edge set $E$. A graph $G_1 = (V_1, E_1)$ is a subgraph of $G$ if
$V_1 \subseteq V$ and $E_1 \subseteq E$. A graph $G_2 = (V_2, E_2)$ is an induced
subgraph of $G$ if $V_2 \subseteq V$ and for all $v_1 \in V_2$ and $v_2 \in V_2$, we
have $(v_1, v_2) \in E_2$ if and only if $(v_1, v_2) \in E$. In addition,
$G_2$ is often denoted as $G(V_2)$ and is said to be induced by
$V_2$. The complement of graph $G$, denoted $\bar{G}$, is a graph on the
same vertex set $V$ such that two vertices of $\bar{G}$ are adjacent if
and only if they are not adjacent in $G$.

Definition 6 (Chromatic Number): The chromatic number
$\chi(G)$ of a graph $G$ is the smallest number of colors needed to
color the vertices of $G$ so that no two adjacent vertices share
the same color.

Definition 7 (Complete Graph): G is a complete graph if 
for every pair of vertices in V there exists an edge connecting 
the two. 

Definition 8 (Multipartite Graph): G is a multipartite 
graph if V can be partitioned into non-empty subsets, called 
partitions, such that no two vertices in the same partition have 
an edge connecting them. 

Definition 9 (Complete Multipartite Graph): G is a com- 
plete multipartite graph if G is a multipartite graph such that 
any two vertices that are not in the same partition have an 
edge connecting them. 

Definition 10 (Clique): In a graph $G = (V, E)$, a set of
vertices $V_1 \subseteq V$ is said to form a clique if these vertices
induce a complete graph.

Definition 11 (Clique Number): The clique number $\omega(G)$
of a graph $G$ is the number of vertices of the largest clique in
$G$.

Definition 12 (Stable Set): In a graph $G = (V, E)$, a set of
vertices $V_1 \subseteq V$ is said to form a stable set if for every pair
of vertices in $V_1$, there is no edge connecting the two.

Definition 13 (Fractional Weighted Coloring Problem):
Given a graph $G$ and a weight $w_v \in \mathbb{R}_+$ for each vertex $v$,
minimize $\sum_i \lambda_i$ ($\lambda_i \in \mathbb{R}_+$, $\forall i$) such that there
exist stable sets $\{S_i\}$ of $G$ with $\sum_i \lambda_i \chi^{S_i} = w$, where $w$ is
the given weight vector, and $\chi^{S}$ denotes the incidence vector
of the stable set $S$. The optimum value of the minimization
problem is called the fractional weighted chromatic number.
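
As a small worked instance of Definition 13, consider the 5-cycle with unit weights (an example chosen here, not taken from the paper). The sketch below verifies that five stable sets of size two, each with coefficient 1/2, cover every vertex with total weight exactly 1, for an objective value of 5/2. Because no stable set of the 5-cycle has more than two vertices, any feasible solution needs coefficients summing to at least 5/2, so this value is optimal.

```python
from fractions import Fraction

n = 5
weights = [Fraction(1)] * n                     # w_v = 1 for every vertex
# Stable sets {i, i+2 mod 5} of the cycle 0-1-2-3-4-0.
stable_sets = [frozenset({i, (i + 2) % n}) for i in range(n)]
lam = [Fraction(1, 2)] * n                      # coefficient of each stable set


def is_stable(S):
    """In the cycle, u and u+1 (mod n) are adjacent; S must contain no such pair."""
    return all((u + 1) % n not in S for u in S)


# Weight placed on each vertex: sum of coefficients of the sets containing it.
coverage = [sum(l for l, S in zip(lam, stable_sets) if v in S) for v in range(n)]

print(all(is_stable(S) for S in stable_sets))
print(coverage == weights)
print(sum(lam))  # 5/2
```

In scheduling terms, each stable set plays the role of a switch configuration and its coefficient is the fraction of time that configuration is used.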

Definition 14 (Hole): G is a hole if it is a chordless cycle; 
G is called an odd hole if it is a hole of odd length at least 5. 



Definition 15 (Anti-hole): G is an anti-hole if its comple- 
ment is a hole; G is an odd anti-hole if its complement is an 
odd hole. 

Definition 16 (Perfect Graph): G is said to be perfect if
for every induced subgraph of G, the size of the largest clique
equals the chromatic number.
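
The definitions above can be explored by brute force on small graphs. The following sketch, with hypothetical helper names, computes the clique number and the chromatic number of the 5-cycle (an odd hole) and of the 4-cycle. It is exponential in the number of vertices and is meant only as an illustration.

```python
from itertools import combinations, product


def cycle(n):
    """Edge set of the cycle on vertices 0..n-1."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}


def clique_number(n, edges):
    """Size of the largest set of pairwise adjacent vertices (n >= 1)."""
    best = 1
    for k in range(2, n + 1):
        for S in combinations(range(n), k):
            if all(frozenset(pair) in edges for pair in combinations(S, 2)):
                best = k
    return best


def chromatic_number(n, edges):
    """Smallest k such that a proper coloring with k colors exists (n >= 1)."""
    for k in range(1, n + 1):
        for colors in product(range(k), repeat=n):
            if all(colors[u] != colors[v] for u, v in map(tuple, edges)):
                return k


for n in (4, 5):
    edges = cycle(n)
    print(n, clique_number(n, edges), chromatic_number(n, edges))
```

For the 5-cycle the largest clique has 2 vertices but 3 colors are needed, so the 5-cycle is not perfect; for the 4-cycle the two numbers agree.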

1) Stable set polytope: The stable set polytope $STAB(G)$
of a graph $G = (V, E)$ is the convex hull of the incidence
vectors¹ of the stable sets of the graph $G$. For a general
graph $G$, it is NP-hard to compute the stable set polytope
$STAB(G)$, and a complete characterization of $STAB(G)$ in
terms of linear inequalities is unknown.

However, several families of necessary conditions are
known. One example is the clique inequalities:

$$\sum_{i \in Q} x_i \leq 1 \qquad (1)$$

for all cliques $Q$ in $G$. Clique inequalities of a graph say that
the total weight on the vertices of maximal cliques must not
exceed 1. Note that an incidence vector of a stable set must
satisfy all the clique inequalities since a stable set can
have at most one vertex from each clique in a graph. Thus,
this shows that the clique inequalities are necessary conditions
for the stable set polytope. The polytope described by these
clique inequalities along with non-negativity constraints

$$x_i \geq 0 \qquad (2)$$

for all nodes $i$ of $G$ is called the fractional stable set polytope
$QSTAB(G)$. The fractional stable set polytope is often used
as a canonical relaxation of $STAB(G)$. Note that, for most
graphs, $STAB(G) \subset QSTAB(G)$, since the clique
inequalities are necessary but not sufficient conditions for the stable set
polytope. The two polytopes coincide precisely when $G$ is
perfect.

Another family of necessary conditions is the odd hole 
constraints [TJ: 

$$\sum_{i \in H} x_i \le \frac{|H| - 1}{2} \qquad (3)$$



where $H$ is a set of vertices that induce an odd hole in graph
$G$, and $|H|$ denotes the cardinality of $H$. The
incidence vector of a stable set must satisfy the odd hole
constraints, since a stable set can contain at most one of
any two adjacent vertices, and can therefore include at most
every other vertex in a cycle.
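
The gap between the clique inequalities and the odd hole constraints can be seen on a small case. The sketch below is a brute-force check suitable only for tiny graphs, and all function names are illustrative. It tests the point $(1/2, \ldots, 1/2)$ on a 5-cycle: the point satisfies every clique inequality and the non-negativity constraints, yet violates the odd hole constraint, so it lies in QSTAB but not in STAB.

```python
from fractions import Fraction
from itertools import combinations

def cliques(n, edges):
    """Yield every non-empty clique of the graph as a tuple (brute force)."""
    adj = {frozenset(e) for e in edges}
    for size in range(1, n + 1):
        for cand in combinations(range(n), size):
            if all(frozenset(p) in adj for p in combinations(cand, 2)):
                yield cand

def in_qstab(x, n, edges):
    """Check non-negativity and every clique inequality."""
    return all(xi >= 0 for xi in x) and all(
        sum(x[v] for v in q) <= 1 for q in cliques(n, edges))

def satisfies_odd_hole(x, hole):
    """Check the odd hole constraint for a vertex set assumed to induce an odd hole."""
    return sum(x[v] for v in hole) <= Fraction(len(hole) - 1, 2)

c5_edges = [(k, (k + 1) % 5) for k in range(5)]
x = [Fraction(1, 2)] * 5
print(in_qstab(x, 5, c5_edges))         # True
print(satisfies_odd_hole(x, range(5)))  # False
```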

2) Perfect graph: From the definitions in Section II-D, it
is not hard to see that in any graph, the clique number is a
lower bound on the chromatic number, since the vertices of a
clique must all be assigned distinct colors in any proper coloring.
Perfect graphs are those for which this lower bound is tight
for every induced subgraph.

One of the important features of perfect graphs is that many
NP-hard graph problems become easy to solve on perfect
graphs. For example, the graph coloring problem, the maximum
clique problem, the maximum stable set problem, as well as the
stable set polytope problem are all known to be solvable in
polynomial time for perfect graphs [30]. In addition, perfect
graphs admit a complete characterization of STAB(G) in
terms of linear inequalities: STAB(G) = QSTAB(G) if and
only if G is perfect; thus STAB(G) is defined by the clique
inequalities and the non-negativity constraints if and only if
G is perfect.

¹The incidence vector of a set of vertices $V_1 \subseteq V$ is a $\{0, 1\}$-vector $x$
whose entries are labeled with the vertices of $G$. If $x_i = 1$, then vertex $i$ is
in $V_1$; otherwise, $i \notin V_1$.

We now state three well-known theorems about perfect
graphs, which can be found on pages 1107-1111 of [30].

Theorem 1: (Weak Perfect Graph Theorem) A graph G is 
perfect if and only if its complement is perfect. 

Theorem 2: (Strong Perfect Graph Theorem) A graph G is 
perfect if and only if it contains no odd hole and no odd anti- 
hole. 

Lemma 1: (Replication Lemma) Let $G = (V, E)$ be a
perfect graph and $v \in V$. Create a new vertex $v'$ and join
it to $v$ and to all the neighbors of $v$. Then, the resulting graph
$G'$ is perfect.

Some of the well known perfect graphs that we shall be 
using in this paper are: complete graphs, bipartite graphs, 
split graphs (graphs whose vertices can be partitioned into 
two disjoint sets, which induce a stable set and a clique 
respectively), and disjoint union of perfect graphs. 

It is not hard to imagine that there can be different degrees of
"perfection" in a graph. Consider two graphs G and H,
neither of which is perfect, where STAB(G) and QSTAB(G)
are of approximately equal size while STAB(H) is much
smaller than QSTAB(H). In such a case, we would consider
G to be "more perfect" than H. This observation gives rise to
the need for a metric that measures how perfect a graph is.
The imperfection ratio [12] was introduced precisely for this
purpose.

3) Imperfection ratio: In [12], the imperfection ratio
imp(G) of a graph G is defined as

$$\mathrm{imp}(G) = \min\{t : QSTAB(G) \subseteq t \cdot STAB(G)\}. \qquad (4)$$

In essence, the imperfection ratio measures how much bigger
the fractional stable set polytope QSTAB(G) is relative to the
stable set polytope STAB(G). Since STAB(G) ⊆ QSTAB(G), we have
imp(G) ≥ 1 for any graph G, and imp(G) = 1 when G is perfect.
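
As a small worked illustration of the definition (an added example, not taken from the original text), consider the 5-cycle $C_5$, the smallest odd hole. Applying the odd hole constraint to a scaled copy of the point $x = \tfrac12 \mathbf{1}$ gives a lower bound on $\mathrm{imp}(C_5)$:

```latex
% Worked example: a lower bound on imp(C_5) from the definition.
\begin{align*}
  x = \tfrac12 \mathbf{1} &\in QSTAB(C_5)
     && \text{(each edge carries weight } \tfrac12 + \tfrac12 = 1\text{)} \\
  x \in t \cdot STAB(C_5)
    &\implies \tfrac{1}{t}\, x \text{ satisfies the odd hole constraint with } H = V(C_5) \\
    &\implies \frac{5}{2t} \le \frac{5 - 1}{2} = 2 \\
    &\implies t \ge \tfrac54 .
\end{align*}
% Hence imp(C_5) >= 5/4 > 1, consistent with C_5 not being perfect.
```

This derivation only establishes the lower bound. Showing that $t = 5/4$ is also sufficient would require checking the remaining vertices of $QSTAB(C_5)$, which is omitted here.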

A useful bound on the imperfection ratio is presented in [13]
and as Corollary 2.3.5 in [11], which we reproduce below.

Proposition 1: (Gerke and McDiarmid) For a graph G, if
each vertex in G can be covered q times by a family of p
induced perfect subgraphs, then imp(G) ≤ p/q.

We shall later revisit this notion of imperfection of a graph
when we study the rate regions of multicast switches in Section
IV and relate this notion to speedup in switches.

III. Conflict graphs and network coding 

In a general network, a link may be configured to one
of several possible states, for instance, by an algorithm that
computes the schedule or the network code. The assignments
of states to different links are likely to depend on each other. In
Section III-A, we present a graph-theoretic model to capture
this dependence in a general network. In Section III-B, we
apply this approach to the case of multicast switches with
network coding to define the notion of the enhanced conflict
graph. We shall use this model in Sections IV and V to obtain
our main result.

A. Conflict graph 

Let $\mathcal{N} = (V, E)$ be a directed acyclic graph which represents
a network. The conflict graph $\mathcal{N}' = (V', E')$ is an
undirected graph corresponding to the network $\mathcal{N}$, and is
constructed as follows:

• For every link $l \in E$, create a set of vertices in $V'$
so that there is a one-to-one correspondence between all
the possible states $s$ of link $l$ and the vertices $v_{(l,s)}$.

• Connect two vertices $v_{(l,s)}$ and $v_{(l',s')}$ if assigning both
state $s$ to link $l$ and state $s'$ to link $l'$ simultaneously is
impossible. This implies that there is an edge between
all pairs of vertices $v_{(l,s)}$ and $v_{(l,t)}$ where $s \ne t$, since a link
cannot be assigned two different states simultaneously. In
more general scenarios, we may need to model conflicts
using hyperedges to capture cases where a combination
of states may be incompatible while any subset of them
could coexist. For instance, given a set of inputs, a node
can only output a function of those inputs. Thus, if the
output link state is not compatible with the combination
of input link states, we connect the vertices corresponding
to those states with a hyperedge.
Once we have constructed our conflict graph, a stable set 
represents a collection of states for links such that there is 
no conflict, i.e., it is possible to assign the set of states to 
the links in the network. Thus, a valid configuration in the 
network corresponds to a stable set, and any achievable rate 
can be achieved by time-sharing between the stable sets. This 
means that we can represent the achievable rate region by a 
convex hull of the stable sets, i.e., the stable set polytope of 
the conflict graph. 

Although this conflict graph formulation is easy to conceptualize,
it has been noted in [33] that the size of a conflict graph
grows exponentially with the number of possible states for
each link. Furthermore, the problem of computing the stable
set polytope of a graph is known to be NP-hard, as discussed
in Section II-D1. Thus, we do not expect to find an efficient
algorithm that computes the schedule for a given set of rates in
time polynomial in the size of the network. This
motivates us to look into combinatorial and graph-theoretic 
tools to help us understand the structure of the rate region and 
exploit this structure to design efficient scheduling algorithms. 

B. Enhanced conflict graph 

The enhanced conflict graph is a special kind of conflict
graph introduced by [33], which is used to characterize the
rate region of multicast switches using network coding. The
enhanced conflict graph $G = (V, E)$ for a traffic pattern is an
undirected graph defined as follows:

• For every subflow, create a vertex.

• Vertices representing subflow $(i, J, j)$ and subflow
$(i', J', j')$ are connected if and only if

- $j = j'$, or
- $i = i'$ and $J \ne J'$.
In other words, vertices are adjacent if and only if they
have the same output, or if they are from the same input
and they belong to different flows.
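
The construction above can be sketched in a few lines. This is a minimal illustration under stated assumptions: a flow is written as a pair `(i, J)`, a subflow as a triple `(i, J, j)`, and the traffic pattern is a hypothetical one (it is not the pattern of Figure 4). The function names are ours.

```python
from itertools import combinations

def subflows(flows):
    """Expand each flow (i, J) into its subflows (i, J, j), one per output j in J."""
    return [(i, frozenset(J), j) for i, J in flows for j in sorted(J)]

def conflict(a, b):
    """Adjacency rule: same output, or same input but different flows."""
    (i, J, j), (i2, J2, j2) = a, b
    return j == j2 or (i == i2 and J != J2)

def enhanced_conflict_graph(flows):
    """Return (vertices, edges) of the enhanced conflict graph."""
    vertices = subflows(flows)
    edges = {frozenset((a, b)) for a, b in combinations(vertices, 2)
             if conflict(a, b)}
    return vertices, edges

# Hypothetical traffic pattern: input 1 multicasts to outputs {1, 2} and also
# unicasts to output 2; input 2 unicasts to output 1.
flows = [(1, {1, 2}), (1, {2}), (2, {1})]
vertices, edges = enhanced_conflict_graph(flows)
print(len(vertices), len(edges))  # 4 3
```

In this example the two subflows of the multicast flow are not adjacent, which reflects the fact that a single coded packet can serve both outputs in the same slot.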
The enhanced conflict graph is constructed such that the
maximal cliques reflect the admissibility condition, which we
shall formally state and prove in Section IV-A. To briefly
discuss the intuition, the constraint that no input should send
more than one unique packet at a time is represented by the
edges connecting nodes corresponding to subflows $(i, J, j)$
and $(i, J', j')$ where $J \ne J'$. It is important to note that
nodes representing subflows from the same flow, for example
$(i, J, j)$ and $(i, J, j')$ where $j \ne j'$, are not adjacent. This
is because two subflows from the same flow can be served
simultaneously, since input $i$ can send a single packet that is
simultaneously useful to multiple outputs by coding packets
together. This will be discussed in more detail in the derivation
of the achievable region. The second constraint, that no output
should receive more than one packet at a time, is accounted
for by the edges connecting vertices of subflows $(i, J, j)$ and
$(i', J', j')$ where $j = j'$.

In addition to encoding the admissibility condition with 
cliques, the enhanced conflict graph also encodes information 
about achievable rate regions. A stable set in an enhanced 
conflict graph represents a set of subflows that can be served 
simultaneously in a valid switch configuration. For instance, 
any subset of the subflows that belong to the same multicast 
flow form a stable set, and they can be served simultaneously. 

Example 4: An example of an enhanced conflict graph of
the traffic pattern shown in Figure 4 is given in Figure 8.

[Figure 8: graph drawing; the surviving vertex labels are (1, {1,2,3}, 1), (1, {1,2,3}, 2), and (1, {1,2,3}, 3).]

Fig. 8. Enhanced conflict graph of traffic pattern shown in Figure 4.

This graph-theoretic formulation helps us transform any 
given traffic pattern in a multicast switch into an enhanced 
conflict graph, and the properties of this graph can be used 
to derive insight on the rate regions of the switch, as shown
in Section IV. A similar graph-theoretic formulation was also
used by Caramanis et al. [5] in the context of unicast traffic
in Banyan networks.

It is important to note the difference between the enhanced
conflict graph and the conflict graphs introduced in Section
III-A or in [5]. In a conflict graph, the vertices represent
configurations of a network, and therefore, conflict graphs are
not very well equipped to represent fanout-splitting. Reference
[5] only considers unicast traffic; therefore, its formulation
naturally does not incorporate fanout-splitting. However, the
enhanced conflict graphs, by representing subflows with separate
vertices, naturally incorporate the fanout-splitting capability
of a multicast switch.

The enhanced conflict graph formulation defined above does
not work for the case of fanout splitting without network 
coding. This is because if coding is not allowed, then it may 
not always be possible to serve subflows from the same flow, 
even though the switch allows the input to be connected to 
multiple outputs. This might happen, for instance, when each 
output wants a different packet. Without coding, it is not 
possible to satisfy multiple outputs with a single packet, even 
if the input is connected to all the outputs. 

For the case of fanout-splitting without coding, Marsan et
al. [26] gave a characterization of the rate region as the convex
hull of certain modified departure vectors. However, this
formulation does not have a neat graph-theoretic interpretation
in general. On the other hand, allowing network coding not
only increases throughput, but as we will show in Section
IV, it also leads to a simpler description of the rate region and
enables the use of graph-theoretic tools.

C. Properties of enhanced conflict graph of a multicast switch 

In this section, we describe some interesting proper- 
ties/structure of the enhanced conflict graph of a multicast 
switch. We show that the class of graphs that are the enhanced 
conflict graph of some multicast traffic pattern does not cover 
the class of all possible graphs - i.e., there is some structure 
to the enhanced conflict graph. This is of interest from an 
algorithmic perspective. Since the enhanced conflict graph can 
be used to characterize the rate region of multicast switches 
(see Section IV), it is useful to understand the structure of
the enhanced conflict graph in order to determine whether it 
is possible to develop efficient algorithms to compute the rate 
region and schedules for multicast switches. 

As mentioned in Section III-B, in an enhanced conflict
graph of a multicast switch, conflicts between a pair of 
subflows exist due to one or both of the following two reasons: 

• The two subflows go to the same output. 

• The two subflows originate at the same input, and belong 
to different flows. 

This constrains the structure of the enhanced conflict graph for 
multicast switches. First, we make the following observation 
about subflows arising at the same input. 

Lemma 2: In the subgraph induced by subflows from the
same input, co-$P_3$ is a forbidden induced subgraph, where co-$P_3$
(the complement of a path on three vertices) is given in Figure 9.

Proof: First, we argue that subflows from the same input 
induce a complete multipartite graph. Consider subflows from 
each flow at the input as a different partition. By the way edges 
are defined, subflows from the same flow do not conflict, but 
any two subflows of two different flows do conflict, resulting 
in a complete multipartite graph. 

A complete multipartite graph cannot contain co-$P_3$ as an
induced subgraph. This is because any two vertices that are
not adjacent in a complete multipartite graph must be from the
same partition. Referring to Figure 9, vertices A and B must
be from the same partition. Similarly, vertices A and C must
also be from the same partition. Therefore, B and C must
be from the same partition, which means they should not be
connected to each other - a contradiction. ■

Lemma 3: For any $k > 1$, if the enhanced conflict graph
has a set $S$ of $k$ vertices such that no two of them are adjacent
to each other but they all have a common neighbor $v$, then at
least $(k - 1)$ of the vertices in $S$ must represent subflows from
the same input as $v$.

Proof: Every vertex in $S$ is a neighbor of $v$. So, it either
has the same output as $v$, or is from the same input as $v$ but
from a different flow. Suppose the given statement is false.
Then, at least two vertices in $S$ are not from the same input as $v$,
and so both of them must represent subflows to
the same output as $v$. However, this would imply that these
two vertices are adjacent to each other, as they conflict
at the output, which contradicts the assumption that no two
vertices of $S$ are adjacent. ■

1) Forbidden graphs in the enhanced conflict graph: We 
now present a collection of graphs that can never occur as a 
subgraph in the enhanced conflict graph of any traffic pattern 
in a multicast switch. 

Theorem 3: The webbed claw (shown in Figure 10) cannot
occur as an induced subgraph in any enhanced conflict graph.

Proof: Suppose there is a traffic pattern in whose enhanced
conflict graph the webbed claw appears as an induced
subgraph. Let i be the input of the subflow represented by the
vertex E in Figure 10. Then, vertices B, F, and D are 3
neighbors of E that are not adjacent to each other. By Lemma
3, at least two of them must have input i. We consider the
following cases:

1) B and D are from input i.

Now, A and C are two non-adjacent neighbors of E.
Again using Lemma 3, at least one of them must represent a
subflow from input i. Without loss of generality, let this
be A. Now, A, B and D are from the same input and
they induce a co-$P_3$. This contradicts Lemma 2.

2) D is not from input i.

Then, B and F must both be from input i. A and D
are two non-adjacent neighbors of E. So, by Lemma
3, at least one of them is from input i. Since D is not
from input i, A must be from input i. Now, A, B and F
are from the same input and they induce a co-$P_3$. This
contradicts Lemma 2.

3) B is not from input i.

By symmetry, this is essentially the same as Case 2.

Thus, we get a contradiction in all cases. This completes 
the proof. ■ 

Theorem 4: The connected double diamond (shown in Figure
11) cannot occur as an induced subgraph in any enhanced
conflict graph.



Fig. 10. The webbed claw

Fig. 11. The connected double diamond

Proof: Let i be the input of subflow C. Now, B, D, F
and G are neighbors of C, no two of which are connected to
each other. Hence, using Lemma 3, at least three of them have
the same input i as C. Without loss of generality, let B, D
and F be from input i.

D and F are non-adjacent neighbors of E. Hence, again
by Lemma 3, at least one of them has the same input as
E. But both D and F have input i. This means E must also
have input i.

Now, B, E and F are subflows from the same input i and
they induce a co-$P_3$. This contradicts Lemma 2. Therefore, the
connected double diamond cannot be an induced subgraph of
any enhanced conflict graph. ■

It can be seen that the Grötzsch graph (shown in Figure
12) contains the connected double diamond as an induced
subgraph: vertices A to G induce a connected double
diamond. This implies the following.

Corollary 1: The Grötzsch graph cannot occur as an induced
subgraph in any enhanced conflict graph.

Proof: If the Grötzsch graph occurred as an induced subgraph,
then so would the connected double diamond, which contradicts
Theorem 4. ■

It is interesting to note that the Grötzsch graph is the third
graph ($G_2$) in a sequence of graphs called the Mycielski
graphs [21]. Mycielski graphs $G_0, G_1, G_2, \ldots$ form a series
of graphs with $\omega(G_i) = 2$ for all $i$, but $\chi(G_i) = 2 + i$. In
addition, $G_i$ contains $G_{i-1}$ as an induced subgraph.




Fig. 12. The Grötzsch graph

Corollary 2: Mycielski graphs $G_i$, for all $i \ge 2$, cannot
occur as an induced subgraph in any enhanced conflict graph.

Mycielski graphs are used to prove that there is no upper
bound on the imperfection ratio of a general graph [21]. Mycielski
graphs form a sequence with unbounded imperfection
ratio, i.e., $\mathrm{imp}(G_i) \to \infty$ as $i \to \infty$. Therefore, the fact that
the enhanced conflict graph does not contain the Mycielski
graphs $G_i$ for $i \ge 2$ means that we cannot yet rule out the possibility that
the imperfection ratio of the enhanced conflict graph may be
bounded as the size of the switch grows.

IV. Rate region of a multicast switch 

In the next few subsections, we discuss how the stable set 
polytope of the enhanced conflict graph is related to the rate 
region of the switch. 

Let $r \in \mathbb{R}_+^f$ be the rate vector of a traffic pattern that has $f$
flows. We call the collections of all achievable and all admissible
rate vectors the achievable rate region $R \subseteq \mathbb{R}_+^f$ and the
admissible rate region $A \subseteq \mathbb{R}_+^f$, respectively. For $r \in R$,
we can construct a switch schedule, which can be viewed as
a time sharing between valid switch configurations (i.e., rate
decomposition).

Suppose that the total number of subflows in the traffic
pattern $r$ is $m$. Then, the enhanced rate vector $e(r) \in \mathbb{R}^m$
corresponding to $r$ is defined as:

$$e_{iJj}(r) = r_{iJ}, \quad \text{for all } j \in J.$$

Therefore, the enhanced rate vector is just an extended version
of the rate vector in which each flow's rate is duplicated as many
times as the number of its subflows. We use the enhanced rate
vector as weights for vertices of the enhanced conflict graph.
As mentioned in Section III-B, subflows that can be served
simultaneously are not adjacent. Therefore, in an enhanced
conflict graph, a valid switch configuration corresponds to a
stable set, and a switch schedule corresponds to a convex
combination of stable sets of the enhanced conflict graph G.
This allows us to draw a connection between the stable set
polytope of the enhanced conflict graph and the rate regions
of the multicast switch.
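
The mapping from a rate vector to its enhanced rate vector is simple enough to state as code. In this minimal sketch, the dictionary layout (flows keyed by `(i, J)`, subflows keyed by `(i, J, j)`) and the example rates are assumptions made for illustration only.

```python
from fractions import Fraction

def enhanced_rate_vector(rates):
    """Map flow rates {(i, J): r_iJ} to subflow rates {(i, J, j): r_iJ}."""
    return {(i, J, j): r for (i, J), r in rates.items() for j in sorted(J)}

# Hypothetical traffic pattern: keys are (input, fanout set), values are rates.
rates = {
    (1, frozenset({1, 2})): Fraction(1, 2),
    (1, frozenset({2})): Fraction(1, 4),
    (2, frozenset({1})): Fraction(1, 3),
}
e = enhanced_rate_vector(rates)
for subflow, rate in e.items():
    print(subflow, rate)
```

The multicast flow contributes two entries with the same rate, one per output in its fanout set, so three flows yield four subflow weights here.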

A. Admissible rate region of a multicast switch 

In this section, we draw a connection between the fractional 
stable set polytope QSTAB{G) of the enhanced conflict 
graph G and the admissible rate region of the multicast 
switch. For a general graph, a complete characterization of the 
stable set polytope in terms of linear inequalities is unknown. 
However, the fractional stable set polytope QSTAB{G), a 
canonical relaxation of STAB{G), can be described by the 
clique inequalities along with non-negativity constraints, as 
mentioned in Section II-D1.

Theorem 5: For any rate vector $r \in A$, the enhanced rate vector
$e(r) \in$ QSTAB(G), i.e., if a non-negative rate vector $r$
satisfies the admissibility conditions, then its enhanced rate
vector $e(r)$ satisfies the clique inequalities of the enhanced
conflict graph G.

Proof: Consider any maximal clique $C$ in $G$. We first claim that, by
the construction of the enhanced conflict graph, the vertices
in $C$ represent subflows that all start from the same input, or
all end at the same output, or both. The claim is trivial for
$|C| = 1$. We prove it below in two cases: $|C| = 2$ and $|C| > 2$.

Consider the case where $|C| = 2$, i.e., $C = \{v_1, v_2\}$,
where $v_1$ and $v_2$ are vertices of $G$ corresponding to subflows
$(i_1, J_1, j_1)$ and $(i_2, J_2, j_2)$ respectively. By construction of the
enhanced conflict graph $G$, one of the following must be true:
either $i_1 = i_2$ or $j_1 = j_2$. This proves the claim. Therefore,
we only need to consider $|C| > 2$.

Now consider two vertices $u$ and $u' \in C$ such that they
represent subflows $(i, J, j)$ and $(i', J', j')$ respectively, where
$i = i'$ or $j = j'$ but not both. (If such a pair of vertices $u$
and $u'$ does not exist in $C$, then $C$ only includes vertices that
start from the same input as well as end at the same output,
making the claim trivially true.) Suppose $i = i'$ but $j \ne j'$.
For any other vertex $v \in C$, $v$ must conflict with both $u$ and
$u'$ since $C$ is a clique. Now, since $u$ and $u'$ have different
outputs, $v$ can have an output-side conflict with at most one
of them. Therefore, $v$ must have an input-side conflict with at
least one of $u$ and $u'$, and hence $v$ also starts from input $i$.
This implies that all vertices in $C$ represent subflows
starting from the same input $i$. A similar argument holds for
the case of $i \ne i'$ but $j = j'$. In this case, all subflows in $C$
will have the same output.

Thus, for any maximal clique $C$ in $G$, the vertices in $C$
represent subflows that all start from the same input, or all
end at the same output. Now, if $r \in A$, then no input or
output is overloaded, i.e., the sum of rates of flows starting
from a given input or destined to reach a given output must
be at most 1. By the way the enhanced conflict graph was
defined, at most one subflow from a given flow can be part
of a clique. Therefore, the admissibility condition implies
the clique inequalities. The non-negativity conditions of $r$
carry over to $e(r)$. Thus, the enhanced rate vector $e(r)$ is
in QSTAB(G). ■

Thus, QSTAB(G) corresponds to the admissible rate
region of the multicast switch, i.e., $A$ is a projection of
QSTAB(G).
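
Theorem 5 can be checked numerically on a small instance. The sketch below is a brute-force check suitable only for tiny traffic patterns. It assumes the admissibility condition takes the form "no input and no output carries total rate above 1", and the function names and example rates are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def is_admissible(rates):
    """No input and no output carries a total flow rate above 1
    (assumed form of the admissibility condition)."""
    load_in, load_out = {}, {}
    for (i, J), r in rates.items():
        load_in[i] = load_in.get(i, 0) + r
        for j in J:
            load_out[j] = load_out.get(j, 0) + r
    return all(v <= 1 for v in list(load_in.values()) + list(load_out.values()))

def conflict(a, b):
    """Enhanced conflict graph adjacency: same output, or same input and different flows."""
    (i, J, j), (i2, J2, j2) = a, b
    return j == j2 or (i == i2 and J != J2)

def satisfies_clique_inequalities(rates):
    """Brute-force check that e(r) has total weight <= 1 on every clique."""
    e = {(i, J, j): r for (i, J), r in rates.items() for j in J}
    verts = list(e)
    for size in range(1, len(verts) + 1):
        for cand in combinations(verts, size):
            if all(conflict(a, b) for a, b in combinations(cand, 2)):
                if sum(e[v] for v in cand) > 1:
                    return False
    return True

# Hypothetical admissible traffic pattern.
rates = {
    (1, frozenset({1, 2})): Fraction(1, 2),
    (1, frozenset({2})): Fraction(1, 4),
    (2, frozenset({1})): Fraction(1, 3),
}
print(is_admissible(rates), satisfies_clique_inequalities(rates))  # True True
```

Raising the two rates at input 1 so that they sum to more than 1 makes both checks fail together, which matches the correspondence between overloaded ports and violated clique inequalities.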

B. Achievable rate region of a multicast switch 

In this section, we will establish the achievable rate region 
in a multicast switch, in terms of the enhanced conflict graph 
of the underlying traffic pattern. 

We first present a few definitions. Since we only consider 
linear coding across packets of the same flow (intra-flow 
coding), the state of knowledge of a switch input or output 
with respect to a particular flow can be represented as a vector 
space, and the backlog of knowledge between an input and 
output can be represented as a virtual queue, in the same 
way as described in [35]. We restate these definitions here
for completeness. 

The vector of coefficients used in the linear combination of 
packets summarizes the relation between the coded packet and 
the original stream. For a given flow, a node can compute any 
linear combination whose coefficient vector is in the linear 
span of the coefficient vectors of previously received coded 
packets from that flow. Thus, the state of knowledge of a node 
with respect to a flow can be defined as follows. 

Definition 17 (Knowledge of a node): The knowledge of a
node with respect to a particular flow at some point in time is
the set of all linear combinations of the original packets of that
flow that the node can compute, based on the information it
has received up to that point. The coefficient vectors of these
linear combinations form a vector space called the knowledge
space of the node, with respect to that flow.

Definition 18 (Innovative Packet): A packet transmitted 
from an input to an output is said to be innovative if it 
conveys previously unknown information to the output. For 
linear coding, this means that the coefficient vector of the 
packet is linearly independent of coefficient vectors of all 
coded packets of that flow received previously by the output, 
thereby conveying a new degree of freedom. In other words, 
the coefficient vector is outside the output's knowledge space 
for the corresponding flow. 

Associated with every subflow is a virtual queue that 
represents the backlog of knowledge between the input and 
the output with respect to the corresponding flow. The formal 
definition is as follows. 

Definition 19 (Virtual Queue): The size of the virtual
queue associated with the subflow $(i, J, j)$ is equal to the
difference between the dimension of the knowledge space of
input $i$ and that of output $j$ with respect to the flow $(i, J)$.

From this definition, it follows that an arrival to a subflow's 
virtual queue occurs when a packet arrives into the correspond- 
ing flow's physical queue. This also means that an arrival rate 
vector r for the flows translates to a rate vector of e(r), the 
enhanced rate vector of r, for the virtual queues. A departure 
(or service) occurs when an innovative packet is conveyed for 
that subflow. Thus, the size of the virtual queue represents the 
number of degrees of freedom that still need to be conveyed to 
the output, in order to communicate all the packets that have 
arrived so far. 

1) The scheduling and coding strategy: We will consider 
frame-based schedules. A frame refers to a set of F consecu- 
tive slots, where F is the frame size. Frame-based schedules 
are schedules that can be specified by a sequence of F switch 
configurations such that the switch cycles through these con- 
figurations periodically. We also call these offline schedules, 
since the schedule is decided based on prior knowledge of the 
arrival rates of the flows, and does not use the instantaneous 
queue size information to decide the switch configuration. We 
begin with a theorem that provides service guarantees for a 
certain set of rate vectors. More specifically, we show that in 
each frame, every queue receives enough service opportunities 
to match the arrival rate. 

Let $q_{iJ}(n)$ be the size of the physical queue of flow $(i, J)$
at the end of the $n$-th frame. Let $a_{iJ}(n)$ denote the number
of arrivals into the queue of flow $(i, J)$ during the $n$-th frame.
Without loss of generality, we assume that the rates of all the
flows are rational.

Theorem 6: Consider a traffic pattern with a rate vector
$r$ and an enhanced rate vector $e$. Suppose fanout splitting
and linear network coding are allowed. Then, the following
statements are equivalent:

1) $e \in$ STAB(G), where G is the enhanced conflict graph
of the traffic pattern.

2) There exists a coding scheme and a frame-based schedule
with a frame size $F$ such that for every flow $(i, J)$,
$r_{iJ}F$ is an integer, and the oldest $r_{iJ}F$ packets that were
in the flow's queue at the end of frame $(n - 1)$ are served
by the end of frame $n$, for all $n \ge 1$. If there were fewer
than $r_{iJ}F$ packets, then all of them are served.

(Here, 'served' means that these packets are conveyed to all
outputs in the fanout of the flow and removed from the flow's
queue.)

Proof: Proof of 1 ⇒ 2: We will present a schedule and
a coding scheme that ensure that the queues are served as in
the theorem statement. In our scheme, the arrivals during a
frame are not processed till the beginning of the next frame.
The proof is by induction on the frame number $n$.

Basis step: The queues are assumed to be empty initially,
hence there are no packets at the end of frame 0 and the
requirement is trivially satisfied for $n = 1$.

Induction hypothesis: Assume the property holds for frame
$k$, for all $1 \le k \le n$.

Induction step: Consider frame $(n + 1)$. By hypothesis, $e \in$
STAB(G). So, we can express $e$ as a convex combination of
the incidence vectors of stable sets of the graph:

$$e = \sum_{i=1}^{m} \phi_i \chi^{S_i},$$

where $\chi^{S_i}$ denotes the incidence vector of the stable set $S_i$,
$\sum_i \phi_i = 1$, and $\phi_i \ge 0$ for all $i$.

Since all rates are assumed to be rational, we can always
pick a frame size $F$ such that $r_{iJ}F$ is an integer for all flows.
Assuming the $\phi_i$'s are rational, we can choose $F$ such that $\phi_i F$
is also an integer for all $i$. Using this $F$ as the frame size, we
construct a frame-based schedule by appropriate time-sharing
among the different switch configurations represented by the
stable sets. Thus, out of $F$ slots in a frame, the switch is
configured to stable set $S_i$ for $\phi_i F$ slots, for each $i$. In each
slot, the stable set specifies to which outputs each input is to
be connected, and which flow is meant to be served over that
connection.
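
The time-sharing step can be sketched as follows. This is a minimal illustration that assumes a rational convex decomposition is already given; the subflow labels `a`, `b`, `c`, `d`, the coefficients, and the function names are hypothetical. The frame size is taken as the least common multiple of the coefficient denominators.

```python
from fractions import Fraction
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def frame_schedule(decomposition):
    """decomposition: list of (phi_i, S_i) with rational phi_i summing to 1.
    Returns (F, slots), where slots is a list of F stable sets."""
    if sum(phi for phi, _ in decomposition) != 1:
        raise ValueError("coefficients must sum to 1")
    F = 1
    for phi, _ in decomposition:
        F = lcm(F, phi.denominator)
    slots = []
    for phi, stable_set in decomposition:
        slots.extend([stable_set] * int(phi * F))  # phi * F is an integer
    return F, slots

# Hypothetical decomposition over four subflows a, b, c, d, where a and b
# are assumed to belong to the same multicast flow.
decomposition = [
    (Fraction(1, 2), {"a", "b"}),
    (Fraction(1, 3), {"c", "d"}),
    (Fraction(1, 6), {"d"}),
]
F, slots = frame_schedule(decomposition)
service = {v: sum(v in s for s in slots) for v in "abcd"}
print(F, service)  # 6 {'a': 3, 'b': 3, 'c': 2, 'd': 3}
```

Dividing each service count by $F$ recovers the corresponding entry of $e$, here $(1/2, 1/2, 1/3, 1/2)$.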

This schedule has the property that for each flow $(i, J)$,
each output $j$ in its fanout set $J$ is connected to input $i$
in exactly $r_{iJ}F$ slots during one frame. However, owing to
fanout splitting, different outputs in $J$ may be connected in
different sets of slots, with possible overlap. Of the $F$ slots in
the schedule, let $T_{iJ}$ be the total number of slots when input
$i$ is connected to at least one of the outputs in $J$, for serving
flow $(i, J)$. Thus, $T_{iJ}$ is in general more than $r_{iJ}F$ due to
fanout splitting.

We propose a coding scheme that uses a maximum distance
separable (MDS) code [25]. The key property of an MDS code
that we use here is that an $(n, k)$ MDS code can correct up
to $(n - k)$ erasures, each of which may occur anywhere in
the codeword. Hence, using any $k$ codeword symbols one can
retrieve all the information.

In order to guarantee the service of the queues as in
the theorem statement, the algorithm must serve the oldest
$\min(q_{iJ}(n), r_{iJ}F)$ packets from the queue of flow $(i, J)$, for
each flow. In our coding scheme, the input uses a $(T_{iJ}, r_{iJ}F)$
MDS code.

The information word has $r_{iJ}F$ symbols (packets) and is
chosen as follows. If $q_{iJ}(n) \ge r_{iJ}F$, then the oldest $r_{iJ}F$ packets
are chosen as the information word. If $q_{iJ}(n) < r_{iJ}F$, then
the oldest $q_{iJ}(n)$ packets are chosen along with $r_{iJ}F - q_{iJ}(n)$
dummy all-zero packets, to form the information word. The
input computes the MDS codeword treating these $r_{iJ}F$ packets
as symbols of the information word. The resulting codeword
symbols are sent at each of the $T_{iJ}$ transmission opportunities.
Since each output in the fanout of $(i, J)$ is guaranteed to
receive $r_{iJ}F$ codeword symbols, it can retrieve all the $r_{iJ}F$
packets in the information word. The schedule and the code
are computed offline, and are known to all inputs and outputs.
This proof assumes a mechanism that helps outputs to identify
dummy packets that may have been added while forming the
information word.
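
The erasure-recovery property used above can be illustrated with a Reed-Solomon code, which is one well-known family of MDS codes. The text does not say which MDS code is intended, so this choice is an assumption. The sketch also assumes the prime field GF(257) and one field symbol per packet; a real packet would be encoded symbol by symbol in the same way.

```python
P = 257  # prime field size; must exceed the codeword length n

def lagrange_eval(points, x, p=P):
    """Evaluate at x the unique polynomial of degree < len(points)
    through the given (x_j, y_j) pairs, working modulo the prime p."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x - xm) % p
                den = den * (xj - xm) % p
        # pow(den, p - 2, p) is the inverse of den modulo the prime p
        total = (total + yj * num * pow(den, p - 2, p)) % p
    return total

def mds_encode(info, n, p=P):
    """Systematic (n, k) Reed-Solomon encoding: codeword symbol t is the value
    at x = t + 1 of the polynomial taking the k information symbols at x = 1..k."""
    base = list(enumerate(info, start=1))
    return [lagrange_eval(base, x, p) for x in range(1, n + 1)]

def mds_decode(received, k, p=P):
    """Recover the k information symbols from any k (position, symbol) pairs,
    where position is the 0-based index in the codeword."""
    points = [(pos + 1, sym) for pos, sym in received[:k]]
    return [lagrange_eval(points, x, p) for x in range(1, k + 1)]

info = [5, 0, 17]                 # k = 3 information symbols (one dummy zero)
codeword = mds_encode(info, n=6)  # 6 transmission opportunities
# An output connected only in slots 1, 3 and 5 still recovers everything.
seen = [(t, codeword[t]) for t in (1, 3, 5)]
print(mds_decode(seen, k=3))      # [5, 0, 17]
```

Here the codeword length 6 plays the role of $T_{iJ}$ and $k = 3$ plays the role of $r_{iJ}F$: any three received symbols determine the degree-2 polynomial, and hence the information word.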

Proof of 2 ⇒ 1: This proof was given in [34], and is
summarized here for completeness. Suppose there is a frame-based
schedule of switch configurations and an associated code
such that the requirement in statement 2 of the theorem is met.
Consider an arbitrary flow $(i, J)$. Satisfying the requirement in
statement 2 implies that when there are $r_{iJ}F$ or more packets
in the queue at the beginning of a frame, the schedule is
capable of conveying to every output $j \in J$, $r_{iJ}F$ innovative
packets or degrees of freedom from that flow by the end of
the frame.

Based on this achieving schedule, form F indicator vectors, 
one for each slot. Each vector has one entry for every subflow 
such that the entry is a 1 if the schedule conveys an innovative
packet for that subflow in that slot, and 0 otherwise. These
vectors can be viewed as indicators for whether each virtual 
queue received a service or not in that slot. Adding these 
indicator vectors over all the F slots must then give F times 
the enhanced rate vector, since the requirement is satisfied 
for every subflow. In other words, e is the average of such 
indicator vectors over all the F slots in the schedule. But, if 
a set of subflows receive an innovative packet in the same 
slot, then they must first of all, be conflict-free in terms of 
the switch constraints, i.e., each indicator vector has to be the 
incidence vector of some stable set of the enhanced conflict 
graph. Therefore, e can be written as a convex combination of 
the stable sets. This proves that statement 2 implies statement 
1. ■ 
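The averaging step in this proof can be sketched on a toy frame. The example below is an assumption made for illustration (a 2 × 3 switch with one broadcast from input 1 and unicasts from input 2, over a frame of F = 3 slots); the tuple encoding of subflows and the helper `conflict_free` are hypothetical names, and the check encodes only the switch constraints of this particular pattern.

```python
from fractions import Fraction

F = 3
outputs = [1, 2, 3]

# Slot t: input 2 serves its unicast to output t, while input 1 serves the
# broadcast subflows to all the other outputs.
schedule = [
    {("u", t)} | {("b", j) for j in outputs if j != t}
    for t in outputs
]


def conflict_free(served):
    """Switch constraints for this pattern: no output is used twice, and
    input 2 serves at most one unicast per slot. Broadcast subflows of the
    same flow may share a slot."""
    outs = [j for _, j in served]
    unicasts = [s for s in served if s[0] == "u"]
    return len(outs) == len(set(outs)) and len(unicasts) <= 1


subflows = sorted({s for served in schedule for s in served})

# Average of the per-slot indicator vectors = enhanced rate vector e.
e = {
    s: Fraction(sum(s in served for served in schedule), F)
    for s in subflows
}
```

Each slot's served set is conflict-free (a stable set of the enhanced conflict graph), and the average gives 2/3 for every broadcast subflow and 1/3 for every unicast, i.e. a convex combination of three stable sets with equal weights.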

In the above proof, the schedule ensures that each input
gets to talk to each output for a sufficient fraction of time about
each flow. To make sure that every transmission opportunity
is used to convey a new degree of freedom, we need to use
an appropriate code. One way to do this is the MDS code
idea described in the proof. However, in general, for an $(n, k)$
MDS code to exist, we need to work over a large field size,
comparable to $n$.

Since we view the packets as symbols over a finite field 
while computing the code, the field size is a parameter of 
interest. Using an MDS code might require a field size that 
depends on the length of the schedule, which is not desirable. 
If the field is too large, then we may need more than one 
packet to represent a single field element, which makes the 
implementation more difficult. On the other hand, the field 
should be large enough to ensure that every transmission
conveys an innovative packet to all the recipients whenever
possible. We will now show an alternate coding strategy that
avoids the large field size requirement of MDS codes, and yet
achieves the desired innovation guarantee. This coding scheme
is based on earlier results of [15] and [18] on multicasting
using network coding. Using this approach, for reasonable
assumptions on the switch size and the packet size, the field
size required will be such that a field element will indeed fit
within one packet.

Proposition 2: In the proof of Theorem 6, a field size equal
to the maximum fanout size is sufficient to ensure that every
transmission is innovative to all recipients, except those that
have already received the packets that were in the queue at
the beginning of the frame.

Proof: We use the same notation as in the proof of
Theorem 6. Consider a network with three layers of nodes. The
first layer has a single node - the source. The second layer
nodes correspond to those time-slots in the frame in which
flow $(i, J)$ is being served. Thus, there are $T_{iJ}$ such nodes. In
the third layer, there is one node corresponding to each output
in the fanout of flow $(i, J)$. The source node is connected to
all nodes in the second layer. A "slot-node" in the second layer
is connected to those "output-nodes" of the third layer which
are served in the corresponding time-slot. All links have unit
capacity. Consider the single source multicast problem with
network coding, from the source node to all nodes of the
third layer. Since the schedule guarantees that every output
receives $r_{iJ}F$ transmissions, this means the min-cut of this
network is $r_{iJ}F$. Therefore, using the results of [15] and [18],
$r_{iJ}F$ packets can be transmitted to each output using network
coding, and the field size required is equal to the number of
destinations, which in our case is the size of the fanout. The
network coding solution to this new network naturally leads
to the code for the switch. ■
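A minimal sketch of the three-layer construction, on an assumed toy frame (3 slots, fanout {1, 2, 3}, each slot serving all outputs but one). The dictionary `slot_to_outputs` and the function names are hypothetical. In this layered network every source-to-output path passes through a distinct slot-node over unit-capacity links, so the min-cut to an output is simply the number of slot-nodes connected to it.

```python
# slot-node -> set of output-nodes served in that slot (toy data)
slot_to_outputs = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
fanout = {1, 2, 3}


def min_cut_to(output):
    """Max-flow (= min-cut) from the source to one output-node: one unit per
    slot-node that has a link to this output."""
    return sum(output in outs for outs in slot_to_outputs.values())


# Multicast capacity is the smallest min-cut over all destinations.
multicast_min_cut = min(min_cut_to(j) for j in fanout)

# Field size that suffices according to the proposition: the fanout size.
sufficient_field_size = len(fanout)
```

Here every output has min-cut 2, so two packets per frame can be multicast to all three outputs, which corresponds to $r_{iJ}F = 2$ with $F = 3$.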

2) The rate region: The schedule used in the above proof
suggests the following algorithm - after every $F$ slots, remove
$r_{iJ}F$ packets from each queue $(i, J)$ and serve them over
the next $F$ slots using an MDS code. This algorithm, viewed
at the time-scale of frames (rather than slots), guarantees
deterministic service to each queue with a rate of $r_{iJ}F$ packets
per frame. Essentially, it ensures the following evolution for
the queue of flow $(i, J)$:

$$q_{iJ}(n + 1) = \left(q_{iJ}(n) - r_{iJ}F\right)^{+} + a_{iJ}(n)$$
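The frame-level recursion above can be traced with a few lines of code. This is a sketch with made-up arrival numbers; `simulate_queue` and its parameter names are hypothetical, and `service_per_frame` plays the role of $r_{iJ}F$.

```python
def simulate_queue(arrivals, service_per_frame, q0=0):
    """Iterate q(n+1) = max(q(n) - service, 0) + a(n) over the given frames
    and return the whole backlog trajectory, starting with q0."""
    q = q0
    trajectory = [q]
    for a in arrivals:
        q = max(q - service_per_frame, 0) + a
        trajectory.append(q)
    return trajectory


# Hypothetical per-frame arrivals with r_iJ * F = 2 packets served per frame.
trajectory = simulate_queue([3, 0, 5, 1, 2], service_per_frame=2)
```

The `max(..., 0)` term is the $(\cdot)^{+}$ operation: a frame never serves more packets than the queue held at its start, and new arrivals are only added after service.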

Working at the level of frames, we use the above theorem 
to establish the rate region for a multicast switch with fanout 
splitting and network coding, under fairly general assumptions 
on the arrival process. 

We use the same assumptions on the arrival process as in
Definition 3.4 of [10]:

• $\lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} E\{a_{iJ}(\tau)\} = r_{iJ}$.

• $E\left[a_{iJ}(t)^2 \mid H(t)\right] \le A_{\max}^2$ for all frames $t$, where $H(t)$
represents the history up to frame $t$.

• For any $\delta > 0$, there exists $T$ such that for any $t_0$,

$$E\left\{ \frac{1}{T} \sum_{k=0}^{T-1} a_{iJ}(t_0 + k) \,\middle|\, H(t_0) \right\} \le r_{iJ} + \delta.$$

The type of stability we consider is also the same as in
Chapter 3 in [10], i.e., strong stability - a queue is strongly
stable if it has a finite time average expected backlog.

Definition 20 (Rate region): For a given traffic pattern, a
rate vector $r$ is said to be achievable if there exists a schedule
and a coding scheme that ensure that all the virtual queues
are strongly stable. The set of all achievable rate vectors of a
given traffic pattern is called the rate region for that pattern.

In Lemma 3.6 of [10], the necessary and sufficient conditions
for the strong stability of a single queue under admissible
arrival and service processes are given. Applying those results
in the present context, we arrive at the following result.

Corollary 3: The rate region $R$ with linear network coding
is given by the set of all rate vectors $r$ such that the enhanced
rate vector $e(r) \in STAB(G)$, where $G$ is the enhanced
conflict graph.

Given the set of rates of the various flows in an achievable
traffic pattern, the switch schedule can be obtained using a
graph-theoretic approach that is discussed further in Section
VI-A. For an arbitrary pattern, this is likely to be a hard
problem, as it involves certain coloring problems on the
enhanced conflict graph. As mentioned in Section II-D1, a
complete characterization of $STAB(G)$ in terms of linear
inequalities is unknown for a general graph $G$. Note that,
for most graphs, $STAB(G) \subset QSTAB(G)$, since the clique
inequalities are necessary but not sufficient conditions for a
stable set polytope. Thus, the admissible region is often a strict
superset of the achievable rate region, which implies that it is
not possible to achieve 100% throughput even with coding -
we need speedup. We shall use this connection between the
conflict graph and rate regions to draw insights into what kind
of benefit network coding gives us in terms of speedup in
Section V.

V. Network coding and speedup 

In this section, we study the effect of allowing network
coding on the speedup requirement in multicast switches. From
Section II-C3, we know that a switch is said to have a speedup
$s$ if the switching fabric can transfer packets at a rate $s$ times
the incoming and outgoing line rate of the switch. This means
the switching fabric can go through $s$ configurations within one
slot. Given a traffic pattern, an important quantity of interest is
the minimum speedup required to sustain all admissible rates,
i.e., to achieve 100% throughput. We denote this as the $s_{\min}$
for that traffic pattern. From the definition of speedup, it is
easy to see that a rate vector $r$ is achievable with speedup $s$ if
and only if it is admissible and $r/s$ is within the achievable rate
region. Using this fact, $s_{\min}$ is simply the smallest value of $s$
such that $r/s$ is within the rate region for all admissible rates
$r$. If we denote the admissible and achievable rate regions as
$A$ and $R$ respectively, then $s_{\min} = \min\{s \mid A \subseteq sR\}$.

The section is organized as follows. We first present a
special traffic pattern for which the value of $s_{\min}$ is lower
bounded by around 1.5 without coding, but is exactly 1 (i.e.,
no speedup) with coding. Then, we present a graph-theoretic
bound on $s_{\min}$ for a general traffic pattern in a $K \times N$ switch.
Finally, we present numerical simulation results that quantify
the actual benefit of network coding in terms of the rate region
and speedup.




Fig. 13. A traffic pattern which demonstrates the benefit of coding

A. Network coding reduces speedup required: An example 

In Figure 4, we already saw an example of a traffic pattern
which can be achieved with network coding but requires a
speedup otherwise. In this section, we present a generalization
of that example and explicitly quantify the minimum speedup
needed to support all admissible traffic rates with and without
coding.

The traffic pattern we consider is shown in Figure 13. It is
a $2 \times N$ switch with one broadcast flow from input 1 with
rate $r_0$ and a unicast from input 2 to every output $i$ with rate
$r_i$, for $i \in [N]$. (We use the notation $[m]$ to denote the set of
integers from 1 to $m$.)

In order to understand the reason for the benefits of coding,
we first study a special rate point for this traffic pattern, shown
in Figure 14: set $r_0 = 1 - \frac{1}{N}$ and $r_i = \frac{1}{N}$ for all $i \in [N]$.
This means that on average, over a period of $N$ slots, $(N - 1)$
packets for the broadcast flow and one packet for each unicast
flow must be served. This is clearly an admissible set of rates.
It can be seen that Figure 4 corresponds to the special case of
$N = 3$.











Fig. 14. A special rate point in the traffic pattern of Figure [13] 

This rate point cannot be achieved with fanout-splitting
alone. In every slot, one of the unicasts from input 2 has to
be served since input 2 has total inflow of rate 1 and can
therefore never be idle. Hence, input 1 needs at least two
slots to completely serve each of its broadcast packets. So,
it requires at least $2(N - 1)$ slots to serve $(N - 1)$ packets.
This is greater than $N$ for $N > 2$. Thus fanout-splitting
without coding cannot achieve a rate of $1 - \frac{1}{N}$. A speedup
is required to achieve this rate point.

On the other hand, this traffic pattern is achievable if
network coding is allowed. The schedule is similar to that
shown in Figure 4. During a frame of $N$ slots, input 2 serves
the unicasts sequentially starting from output 1 to output $N$
for one slot each, thus achieving the required rate of $\frac{1}{N}$ per
unicast. In parallel, input 1 serves the broadcast as follows. In
every slot from 1 to $N - 1$, it sends a new packet from the
broadcast flow to all the outputs except the one occupied by
input 2 during that slot. Finally, in the $N$-th slot, it combines all
the previous $N - 1$ packets using an XOR operation and sends
this linear combination to all available outputs. This schedule
ensures that output $N$ receives all $N - 1$ packets directly.
In addition, the remaining outputs $1, 2, \ldots, N - 1$ also receive
enough information to decode all $N - 1$ packets. Each of these
outputs receives $N - 2$ different packets and one XORed packet
and can then decode the one remaining packet by applying an
XOR operation on all the packets it has received. Thus, $N - 1$
packets are delivered over a period of $N$ slots and input 1
successfully completes the broadcast requirement.
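The XOR schedule described above can be simulated directly. The sketch below is illustrative only: packets are modelled as random integers, and `run_frame` together with its parameters (`packet_bits`, `seed`) are hypothetical names, not part of the paper.

```python
import random
from functools import reduce


def run_frame(N, packet_bits=16, seed=0):
    """Simulate one frame of N slots on the 2 x N pattern and return the
    N - 1 broadcast packets plus what each output decodes."""
    rng = random.Random(seed)
    packets = [rng.getrandbits(packet_bits) for _ in range(N - 1)]
    xor_all = reduce(lambda a, b: a ^ b, packets)

    received = {j: {} for j in range(1, N + 1)}  # output -> {index: packet}
    coded = {}                                   # output -> XORed packet
    for t in range(1, N + 1):
        busy = t  # output taken by input 2's unicast in slot t
        for j in range(1, N + 1):
            if j == busy:
                continue
            if t < N:
                received[j][t - 1] = packets[t - 1]  # uncoded packet t
            else:
                coded[j] = xor_all                   # slot N: XOR of all

    decoded = {}
    for j in range(1, N + 1):
        got = dict(received[j])
        missing = [m for m in range(N - 1) if m not in got]
        if missing:
            (m,) = missing            # each output misses at most one packet
            value = coded[j]
            for v in got.values():    # XOR out the packets already known
                value ^= v
            got[m] = value
        decoded[j] = [got[m] for m in range(N - 1)]
    return packets, decoded


packets, decoded = run_frame(N=5)
```

Output $N$ is busy only in the last slot and so collects all $N - 1$ packets uncoded, while every other output misses exactly one packet and recovers it from the XORed packet sent in slot $N$.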

Next, we formally quantify the benefit network coding 
provides compared to fanout-splitting for this specific traffic 
pattern in terms of the speedup required for achieving all 
admissible rate points. 

To analyze the performance of a network coding switch
with the traffic pattern in Figure 13, we present a theorem
that identifies a key property of the enhanced conflict graph
for this traffic pattern.

Theorem 7: The enhanced conflict graph for the traffic
pattern shown in Figure 13 is a perfect graph.

Proof: The enhanced conflict graph consists of a set of
$N$ subflows from the broadcast from input 1 at rate $r_0$, and
a set of $N$ subflows corresponding to the unicasts from input
2. The unicast subflows form a clique, while the broadcast
subflows form a stable set. Thus, the graph is a split graph,
which is known to be perfect. ■

Now, for a perfect graph $G$, $QSTAB(G) = STAB(G)$.
Therefore, comparing Theorem 5 and Corollary 3, we see that
the admissible region coincides with the achievable rate region.
This leads to the following corollary.

Corollary 4: For the traffic pattern shown in Figure 13,
the entire admissible rate region is achievable without any
speedup if linear network coding is allowed.

Next, we consider the performance of a fanout-splitting
switch given the traffic pattern in Figure 13. The rate region
of this pattern with fanout-splitting but not coding is given
in Theorem 8. (Note: By rate region, we mean the set of
rate vectors for which we can satisfy the same requirement
as in statement 2 of Theorem 6. The connection to the strong
stability of queues can be made in a manner similar to the
discussion in Section IV-B2.)

Theorem 8: The achievable rate region of the pattern shown
in Figure 13 with fanout-splitting but no coding is given by
the following set of inequalities.

$$r_i \ge 0 \quad \text{for } i = 0, 1, \ldots, N \qquad (5)$$

$$\sum_{i=1}^{N} r_i \le 1 \qquad (6)$$

$$r_0 + r_i \le 1 \quad \text{for } i = 1, 2, \ldots, N \qquad (7)$$

$$2 r_0 + \sum_{i=1}^{N} r_i \le 2 \qquad (8)$$

The proof is given in the appendix. Note that the conditions
(6) and (7) are the admissibility conditions. The presence of
an additional constraint (8) shows that fanout splitting does
not achieve all admissible rates.

We now revisit the special rate point considered earlier:
$r_0 = 1 - \frac{1}{N}$; $r_i = \frac{1}{N}$ for all $i \in [N]$. Indeed this rate point
violates the inequality given in Equation (8), thereby confirming
that this point does not lie within the rate region for fanout
splitting without coding. The left hand side evaluates to
$3 - \frac{2}{N}$, while the right hand side is only 2. Hence, the smallest
scaling factor such that the rate vector lies inside the scaled
rate region is $1.5 - \frac{1}{N}$. This leads to the following corollary.
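The arithmetic for the special rate point can be checked with exact fractions. This is a small sketch; the function names `special_rate_point` and `check` are hypothetical, and the inequalities are those numbered (6), (7) and (8) in Theorem 8.

```python
from fractions import Fraction


def special_rate_point(N):
    """r_0 = 1 - 1/N for the broadcast, r_i = 1/N for each of N unicasts."""
    r0 = 1 - Fraction(1, N)
    r = [Fraction(1, N)] * N
    return r0, r


def check(N):
    """Return (admissible?, left-hand side of (8), scaling needed for (8))."""
    r0, r = special_rate_point(N)
    admissible = sum(r) <= 1 and all(r0 + ri <= 1 for ri in r)  # (6), (7)
    lhs8 = 2 * r0 + sum(r)                                      # LHS of (8)
    return admissible, lhs8, lhs8 / 2   # (8) has right-hand side 2


admissible, lhs8, scaling = check(3)
```

For $N = 3$ the point is admissible, the left-hand side of (8) is $7/3 = 3 - 2/3$, and the scaling factor is $7/6 = 1.5 - 1/3$. For $N = 2$ the scaling factor is exactly 1, consistent with the earlier remark that the point is only unachievable for $N > 2$.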

Corollary 5: A speedup of at least $1.5 - \frac{1}{N}$ is needed
to sustain all admissible traffic (i.e., to guarantee 100%
throughput) for the traffic pattern in Figure 13 with fanout-
splitting but no coding.

In other words, we have demonstrated a traffic pattern for
which all admissible rates are achievable with no speedup if
network coding is allowed, but this needs a speedup of
$1.5 - \frac{1}{N}$ if coding is not allowed. A natural question that follows is
- how much speedup benefit does network coding provide for
a general traffic pattern? In particular, does it always achieve
all admissible rates?



B. A lower bound on speedup with network coding 

As shown above, network coding can make otherwise 
unachievable traffic patterns achievable; however, there are 
admissible traffic patterns that are still unachievable even if 
we allow network coding. We already presented an example 
in Figure [6] We now study this example in greater detail. 

Example 5: The traffic pattern in Figure 6 cannot be
achieved even when network coding is allowed - we need
other capabilities such as speedup to achieve this traffic
pattern. To explain this, we consider the enhanced conflict
graph of this traffic pattern as shown in Figure 15. Here, $u_{ij}$
represents the unicast flow vertex from input $i$ to output $j$, and
$b_{ij}$ represents the broadcast subflow vertex from input $i$ to
output $j$. The enhanced conflict graph contains an odd hole;
hence by Theorem 2, it is not perfect. Thus, from Section
Unl we know that the achievable rate region is smaller than 
admissible rate region; the switch needs speedup to achieve 
this traffic pattern even if we have network coding. 




Fig. 15. A traffic pattern which requires speedup and its enhanced conflict 
graph 

It turns out that the traffic pattern in Figure 15 requires
a speedup of 1.25 with network coding. To understand why,
we consider the description of the stable set polytope of the
enhanced conflict graph. As mentioned in Section II-D1, there
are many necessary conditions for a stable set polytope, such
as the odd hole constraints:

$$\sum_{v \in H} x_v \le \left\lfloor \frac{|H|}{2} \right\rfloor \qquad (9)$$

where $H$ is an odd hole and $x_v$ denotes the weight on vertex $v$.

We observe that in Figure 15, each vertex in the odd hole
represents a flow of rate 1/2. Therefore, the total weight on
the odd hole is 5/2, which is the total rate the switch needs
to serve to satisfy the subflows represented by the vertices in
the odd hole. However, the right-hand side of Equation (9) is
$\lfloor |H|/2 \rfloor = \lfloor 5/2 \rfloor = 2$. Hence, the smallest scaling factor such
that the rate vector satisfies the scaled odd hole constraint is
5/4 = 1.25. Therefore, a speedup of at least 1.25 is needed
to serve this traffic pattern in a network coding switch.
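The odd hole computation can be reproduced by brute force on a 5-cycle. This sketch assumes the odd hole is a plain cycle on 5 vertices with weight 1/2 on each vertex, as described in the text; the vertex labels 0..4 and the helper `is_stable` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

n = 5  # size of the odd hole
edges = {frozenset((v, (v + 1) % n)) for v in range(n)}  # the cycle C_5


def is_stable(vertices):
    """True if no two of the given vertices are adjacent."""
    return all(frozenset(pair) not in edges
               for pair in combinations(vertices, 2))


# Largest stable set of the hole, found by exhaustive search.
alpha = max(
    len(subset)
    for size in range(n + 1)
    for subset in combinations(range(n), size)
    if is_stable(subset)
)

weight = {v: Fraction(1, 2) for v in range(n)}  # rate 1/2 on every vertex
total_weight = sum(weight.values())
scaling = total_weight / alpha  # smallest s with total_weight / s <= alpha
```

The search confirms that at most $\lfloor 5/2 \rfloor = 2$ vertices of the hole can be served in one slot, so a total weight of 5/2 needs a scaling of 5/4.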

On the other hand, we show that this traffic pattern only
requires speedup of at most 1.25 when network coding is
allowed. To demonstrate, we present a schedule in Figure 16.
Here, the switch serves two packets for each flow. To achieve
the required rate of 1/2, this should take 4 slots. However,
the switch actually uses 5 configurations. Therefore, the 5
switch configurations have to be mapped to 4 actual slots,
which requires a speedup of 1.25. Hence, this shows that the
speedup needed to achieve this traffic pattern is exactly 1.25.

In the rest of this section, we seek to quantify the minimum
speedup $s_{\min}$ needed to achieve any admissible rate point for
an arbitrary traffic pattern in a switch that uses network coding.
Note that we already have a lower bound - the traffic pattern
in Figure 15 implies that $s_{\min} \ge 1.25$. We will next provide
an upper bound on $s_{\min}$.



C. Imperfection ratio bounds speedup 

This section develops our main result, which relates speedup
with imperfection ratio [20]. The key observation here is that
if the enhanced conflict graph $G$ is perfect, then by definition
$STAB(G) = QSTAB(G)$. In this case, the problem of computing
$STAB(G)$ becomes easy, and therefore, computing the
achievable rate region of a switch is easy as well. In addition,
as noted in Section II-D3, the less "imperfect" a conflict
graph is, the closer the stable set polytope is to the fractional
stable set polytope. Therefore, imperfection ratio translates to
how close the achievable rate region is to the admissible rate
region. Thus, understanding and measuring the perfectness of
the enhanced conflict graph is a useful way of gaining insight
into the benefit of network coding. The relation between the
imperfection ratio and the speedup is stated formally below.

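Theorem 9: Given a traffic pattern, let $G$ be its enhanced
conflict graph and $s_{\min}$ be the minimum speedup required to
achieve all admissible rates. Then,

$$s_{\min} \le \mathrm{imp}(G).$$

Proof: Let $A$ and $R$ denote the admissible and achievable
rate regions for the given traffic pattern.

$$
\begin{aligned}
r \in A &\Rightarrow e(r) \in QSTAB(G) && \text{(Theorem 5)} \\
&\Rightarrow e(r) \in \mathrm{imp}(G)\, STAB(G) && \text{(by definition of } \mathrm{imp}(G)\text{)} \\
&\Rightarrow e\!\left(\tfrac{1}{\mathrm{imp}(G)}\, r\right) \in STAB(G) && \text{(}e(\cdot)\text{ is linear)} \\
&\Rightarrow \tfrac{1}{\mathrm{imp}(G)}\, r \in R && \text{(Corollary 3)}
\end{aligned}
$$

This implies that $A \subseteq \mathrm{imp}(G) R$ and the result follows. ■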
Note that the converse of Theorem 9 is not true. This
is because the enhanced conflict graph $G$ replicates a multicast
flow into subflows, and as a result, induces a stable set
polytope of dimension greater than the number of actual
flows in the traffic. Thus, $A$ and $R$ are projections of
$QSTAB(G)$ and $STAB(G)$ such that the subflows corresponding
to the same multicast flow have the same weight. As
a result, $QSTAB(G) \subseteq \mathrm{imp}(G) STAB(G)$ implies
$A \subseteq \mathrm{imp}(G) R$, but $A \subseteq s_{\min} R$ may not imply
$QSTAB(G) \subseteq s_{\min} STAB(G)$.



D. Bounds on speedup for K x N switch with unicasts and 
broadcasts 

In this section, we apply Theorem 9 to $K \times N$ switches
using intra-flow coding with traffic patterns consisting of
unicasts and broadcasts only. We show that the minimum
speedup needed for 100% throughput in this case is bounded
by $\min\left(\frac{2K-1}{K}, \frac{2N}{N+1}\right)$. The rest of this section is organized as
follows. First, we give a description of the enhanced conflict
graph for a $K \times N$ switch. In Sections V-D2 and V-D3, we
show the upper bounds on speedup of $\frac{2K-1}{K}$ and $\frac{2N}{N+1}$,
respectively.

1) Enhanced conflict graph for $K \times N$ switch: Consider
traffic patterns which consist only of unicasts and one broadcast
per input on a $K \times N$ switch. In such a case, the enhanced
conflict graph, denoted $G_{K,N} = (V, E)$, has the following
structure. (We use the notation $[m]$ to denote the set of integers
from 1 to $m$.)

Each vertex in $G_{K,N}$ represents a subflow in a $K \times N$
switch. The vertex $u_{ij}$ represents the unicast flow from input $i$
to output $j$, and the vertex $b_{ij}$ represents the broadcast subflow
from input $i$ to output $j$. As an example, Figure 17 shows the
switch configuration corresponding to $u_{11}$, $u_{21}$, and $b_{12}$ in a
2 × 3 switch. Thus, the vertex set is given by

$$V = \Big(\bigcup_{i \in [K]} U_i\Big) \cup \Big(\bigcup_{i \in [K]} B_i\Big),$$

where

$$U_i := \{u_{ij} \mid j \in [N]\}, \qquad U_j^o := \{u_{ij} \mid i \in [K]\},$$

$$B_i := \{b_{ij} \mid j \in [N]\}, \qquad B_j^o := \{b_{ij} \mid i \in [K]\}.$$

Thus, $U_i$ and $U_j^o$ are collections of the unicast flows from
input $i$ and to output $j$ respectively. $B_i$ and $B_j^o$ are collections
of the broadcast subflows from input $i$ and to output $j$
respectively.
[Figure 16 panel text: "5 time slots to complete 2 packets for each flow"; labels: packet a, packet b, a + b; time 1, time 2]

Fig. 16. A traffic pattern that requires speedup in a network coding switch

Fig. 17. Switch configuration corresponding to $u_{11}$, $u_{21}$, and $b_{12}$ in $G_{2,3}$



The intuition behind a conflict graph is that vertices which
represent flows that cannot be served simultaneously are
adjacent. Note that if fanout splitting and network coding
are allowed, the switch can simultaneously serve two or
more subflows of the same broadcast flow, and hence such
subflows are not adjacent to each other. Hence, the edge set is

$$E = \Big(\bigcup_{i \in [K]} E_i^u\Big) \cup \Big(\bigcup_{i \in [K]} E_i^b\Big) \cup \Big(\bigcup_{i \in [N]} E_i^o\Big),$$

where

$$E_i^u = \{(u_{ij}, u_{ik}) \mid j \ne k;\ j, k \in [N]\},$$

$$E_i^b = \{(b_{ij}, u_{ik}) \mid j, k \in [N]\}, \text{ and}$$

$$E_i^o = \{(u_{ji}, u_{ki}), (b_{ji}, b_{ki}), (b_{ji}, u_{ki}) \mid j \ne k;\ j, k \in [K]\}.$$

Each edge set represents a different type of conflict. $E_i^u$
represents conflicts among unicasts at input $i$; $E_i^b$ represents
conflict between any broadcast subflow and any unicast at
input $i$; and $E_i^o$ represents conflicts among all flows and
subflows at output $i$.

From the input perspective, $G_{K,N}$ consists of $K$ induced
complete subgraphs $G_{K,N}(U_i)$ for unicasts from each input
$i$, and $K$ induced stable sets $G_{K,N}(B_i)$ for broadcasts from
each input $i$; from the output perspective, $G_{K,N}$ consists of
$N$ induced complete subgraphs $G_{K,N}(U_j^o \cup B_j^o)$ for unicasts
and broadcast subflows to output $j$, for each $j \in [N]$.

Example 6: In Figure 18, we show an enhanced conflict
graph for a 2 × 3 switch with unicasts and
broadcasts only. There is an edge between $u_{11}$ and $b_{12}$, since
they both represent flows served by input 1. There also exists
an edge between $u_{11}$ and $u_{21}$ since they both serve output
1; however, $u_{21}$ and $b_{12}$ are not adjacent since they do not
conflict on the input nor the output side. We can observe that
vertices $u_{1j}$ for all $j$, representing unicast flows from input
1, are adjacent to each other due to input side conflict. This
statement holds for $u_{2j}$ for all $j$ as well. Furthermore, we
can observe that $b_{1j}$ for all $j$, representing broadcast subflows
from input 1, are not adjacent to each other since broadcast
subflows from the same flow can be served simultaneously.
Therefore, we can think of $G_{2,N}$ as consisting of two induced
complete subgraphs $G_{2,N}(U_1)$ and $G_{2,N}(U_2)$ of size $N$ and
two induced stable sets $G_{2,N}(B_1)$ and $G_{2,N}(B_2)$ of size $N$.
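The structure of the enhanced conflict graph for unicasts and broadcasts can be sketched in code and checked against the example. The tuple encoding `(kind, input, output)` and the function names below are assumptions made for illustration; the adjacency rule is a direct transcription of the three kinds of conflict described in the text (among unicasts at an input, between a broadcast subflow and a unicast at an input, and among all subflows at an output).

```python
from itertools import combinations


def enhanced_conflict_graph(K, N):
    """Vertices ("u", i, j) and ("b", i, j) for inputs i in 1..K and outputs
    j in 1..N; returns (V, E) with E a set of frozenset vertex pairs."""
    V = [(kind, i, j)
         for kind in "ub"
         for i in range(1, K + 1)
         for j in range(1, N + 1)]

    def conflict(a, b):
        (kind_a, in_a, out_a), (kind_b, in_b, out_b) = a, b
        if out_a == out_b:
            return True   # output-side conflict
        if in_a == in_b:
            # Same input: only two broadcast subflows can be served together.
            return not (kind_a == "b" and kind_b == "b")
        return False

    E = {frozenset((a, b)) for a, b in combinations(V, 2) if conflict(a, b)}
    return V, E


def adjacent(E, a, b):
    return frozenset((a, b)) in E


V, E = enhanced_conflict_graph(2, 3)
U1 = [v for v in V if v[0] == "u" and v[1] == 1]
B1 = [v for v in V if v[0] == "b" and v[1] == 1]
```

With this encoding, the adjacency claims of the example can be checked directly: $u_{11}$ is adjacent to both $b_{12}$ and $u_{21}$, $u_{21}$ and $b_{12}$ are not adjacent, the unicasts from input 1 form a clique, and the broadcast subflows from input 1 form a stable set.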




Fig. 18. $G_{2,3}$ for a 2 × 3 switch with unicasts and broadcasts only

Here, we note that the conflict graph of a $K \times N$ multicast
switch with unicasts and broadcasts can be relaxed to that
of unicasts and a single multicast per input. This relaxation
just removes the vertices representing broadcast subflows that
are not part of the multicast flow. Removing vertices from a
graph cannot hurt the perfection of a graph. Therefore, any
upper bound on the imperfection ratio of the conflict graph
for unicasts and broadcasts also holds for unicasts
and a single multicast per input. For example, Figures 19
and 20 present two traffic patterns which relax the broadcast
requirement of the traffic pattern shown in Figure 6. In Figure
19, input 1 multicasts to only outputs 2 and 3; therefore, the
node $b_{11}$ in Figure 6 is removed here. In Figure 20, input 1
multicasts to only outputs 1 and 2; therefore, the node $b_{13}$
in Figure 6 is removed here. The imperfection ratio of the
enhanced conflict graph in Figure 19 remains the same as
that of Figure 6, since an odd hole of size 5 is present in
both. However, in Figure 20, the enhanced conflict graph is
perfect, since it is a bipartite graph. This illustrates the fact that
removing vertices from a graph cannot make it less perfect.

Fig. 19. A traffic pattern and its enhanced conflict graph

Fig. 20. A traffic pattern and its enhanced conflict graph

2) Speedup of $\frac{2K-1}{K}$: In this section, we give an upper
bound on speedup for $K \times N$ switches. We present $2K - 1$
induced perfect subgraphs of $G_{K,N}$ that cover all the vertices
$K$ times. Then, with Proposition 1, we have $\frac{2K-1}{K}$ as an upper
bound for speedup.

Lemma 4: Let $G^u = G_{K,N}(\bigcup_{i \in [K]} U_i)$ be an induced
subgraph of $G_{K,N}$. Then $G^u$ is perfect.

Proof: $G^u$ is an enhanced conflict graph for unicast
traffic. One may check that $G^u$ is a line graph of a bipartite
graph, which is known to be perfect ViQi. ■

Lemma 4 also follows from the result in [27], which shows
that 100% throughput can be achieved in an input-queued
crossbar switch in the context of unicast traffic.

Lemma 5: Let $G_i = G_{K,N}\big((\bigcup_{j \in [K]} B_j) \cup U_i\big)$ for some
$i \in [K]$ be an induced subgraph of $G_{K,N}$. Then $G_i$ is perfect.

Proof: Assume that $G_i$ is not perfect. So it must have an
odd hole or odd anti-hole as an induced subgraph. Suppose
it has an odd hole, say $H$. In $G_i$, any broadcast subflow,
except the ones from input $i$, has no conflict on the input
side. Suppose such a subflow were part of $H$; then both its
neighbors in $H$ will be due to output side conflicts. But in that
case, the two neighbors will themselves conflict at the output,
thereby forming a triangle. Since an odd hole cannot contain
a triangle, we conclude that $H$ cannot include any $b_{jk}$ with
$j \ne i$.

This means $H$ must be an induced subgraph of $G_{K,N}(B_i \cup U_i)$.
However, $B_i$ induces a stable set, while $U_i$ induces a
clique. Therefore, $G_{K,N}(B_i \cup U_i)$ is a split graph, which is
known to be perfect. This contradiction shows that $G_i$ cannot
contain an odd hole $H$.

Suppose $G_i$ contains an odd anti-hole. This will happen if
and only if $\overline{G_i}$ contains an odd hole, say $H'$. Note that in $\overline{G_i}$,
two vertices are connected if the corresponding subflows do
not conflict. Now, $H'$ has to contain at least one unicast, say
$u_{ij}$. This is because the broadcasts by themselves induce a
perfect subgraph in $\overline{G_i}$, which is a complement of a disjoint
union of complete graphs. Now, in $\overline{G_i}$, $u_{ij}$ is adjacent to any
$b_{i'j'}$, where $i \ne i'$ and $j \ne j'$. Let $b_{pq}$ and $b_{p'q'}$ be vertices
adjacent to $u_{ij}$ in $H'$. Then, using the definition of $\overline{G_i}$, we can
infer that $i \ne p \ne p' \ne i$ and $q = q' \ne j$. But this means any
vertex that is adjacent to $b_{pq}$ is also adjacent to $b_{p'q'}$. Hence,
$H'$ cannot be an odd hole.

This proves that $G_i$ is perfect. ■

Using Lemmas 4 and 5, we derive our first upper bound
on speedup in $K \times N$ multicast switches with traffic patterns
consisting of unicasts and broadcasts only.

Proposition 3: $\mathrm{imp}(G_{K,N}) \le \frac{2K-1}{K}$.

Proof: Consider the following collection of induced subgraphs:
$K - 1$ copies of $G^u$ from Lemma 4, and $G_i$ from
Lemma 5 for all $i \in [K]$. We know that these subgraphs are
all perfect. In addition, these subgraphs cover each vertex
$v \in G_{K,N}$ $K$ times. For $v \in U_i$, $G_i$ and each copy of $G^u$
covers $v$ once. For $v \in B_i$, each $G_j$, $j \in [K]$, covers $v$ once. By
Proposition 1, the claim follows. ■

For example, $G^u$ and $G_2$ for a 2 × 3 switch are shown
in Figure 21. In this case, the three subgraphs $G^u$, $G_1$ (not shown)
and $G_2$ together cover every vertex in $G_{2,3}$ exactly twice.
This implies an upper bound of 3/2 = 1.5 on the speedup.

Fig. 21. $G^u$ and $G_2$ for a 2 × 3 switch with unicasts and broadcasts only

3) Speedup of $\frac{2N}{N+1}$: The proof idea in this section is similar
to that of Section V-D2. We present $2N$ induced perfect
subgraphs of $G_{K,N}$ that cover all the vertices $N + 1$ times, and
then appeal to Proposition 1. However, unlike Section V-D2,
here we change our focus from the input to the output.

Lemma 6: Let $G^o_{1,i} = G_{K,N}(V_i)$, where $V_i = U_i^o \cup
(\bigcup_{j \in [N]} B_j^o)$, be an induced subgraph of $G_{K,N}$. Then $G^o_{1,i}$ is
perfect.

Proof: Assume that $G^o_{1,i}$ is not perfect. So it must have
an odd hole or odd anti-hole as an induced subgraph. Suppose
it has an odd hole, say $H$. Since $U_i^o \cup B_i^o$ forms a complete
graph (known to be perfect), $H$ must contain vertices of $B_j^o$,
$j \ne i$. Suppose $b_{kj} \in B_j^o$ is part of $H$; then $H$ contains at
least two vertices of $B_j^o$. This is because, in $G^o_{1,i}$, $b_{kj}$ has
only one conflict on the input side; thus, neighbors of $b_{kj}$ are
$u_{ki}$ (input conflict) and $B_j^o$ (output conflict). However, note
that $B_j^o$ itself forms a complete graph; therefore $H$ contains
at most two vertices of $B_j^o$. Thus, $b_{kj}$ and $b_{k'j}$, $k \ne k'$, are in
$H$. Then, $u_{ki}$ and $u_{k'i}$ are in $H$. However, these four vertices
form a cycle; thus $G^o_{1,i}$ cannot contain an odd hole $H$.

By the same argument as in the proof for Lemma 5, we can
show that $G^o_{1,i}$ cannot contain an odd anti-hole. This proves
our claim. ■

Lemma 7: Let $G^o_{2,i} = G_{K,N}(V_i)$, where $V_i = B_i^o \cup
(\bigcup_{j \in [N]} U_j^o)$, be an induced subgraph of $G_{K,N}$. Then, $G^o_{2,i}$
is perfect.

Proof: $G^o_{2,i}$ is an enhanced conflict graph for unicast
traffic in addition to all broadcast subflows to output $i$.
Consider $b_{1i} \in B_i^o$ and $u_{1i} \in \bigcup_{j \in [N]} U_j^o$. In a $K \times N$ switch,
$b_{1i}$ and $u_{1i}$ represent subflows from input 1 to output $i$, and
thus conflict with the same set of subflows, i.e., neighbors
of $u_{1i}$ are neighbors of $b_{1i}$. In addition, $b_{1i}$ and $u_{1i}$ are in
conflict. Therefore, by the Replication Lemma (Lemma 1), we
know that $G^o_{2,i}$ is perfect if $G_{K,N}(V_i \setminus \{b_{1i}\})$ is perfect. We
can apply this argument repeatedly for each $b_{ji} \in B_i^o$, and
deduce that if $G_{K,N}(\bigcup_{j \in [N]} U_j^o)$ is perfect then $G^o_{2,i}$ is perfect.
Note that from Lemma 4, we know that the enhanced conflict
graph $G^u = G_{K,N}(\bigcup_{i \in [K]} U_i) = G_{K,N}(\bigcup_{j \in [N]} U_j^o)$ for unicast
traffic is perfect. Therefore, $G^o_{2,i}$ is perfect. ■

Now, using Lemmas 6 and 7, we can derive an upper bound
for speedup in $K \times N$ multicast switches with traffic patterns
consisting of unicasts and broadcasts only.

Proposition 4: $\mathrm{imp}(G_{K,N}) \le \frac{2N}{N+1}$.

Proof: Consider the following collection of induced subgraphs:
$G^o_{1,i}$ and $G^o_{2,i}$ for all $i \in [N]$. By Lemmas 6 and 7, we
know that these subgraphs are all perfect. In addition, these
subgraphs cover each vertex $v \in G_{K,N}$ $N + 1$ times. By
Proposition 1, the claim follows. ■

TABLE I
A COMPARISON OF THE FOUR SCHEMES IN A 2 × 3 SWITCH WITH ALL FLOWS

Polytope       Volume             Normalized Volume   Speedup to achieve $P_{adm}$
$P_{adm}$      4.921 × 10^-9      1                   1
$P_{coding}$   4.686 × 10^-9      0.952               1.25
$P_{fs}$       4.613 × 10^-9      0.937               1.25
$P_{nofs}$     2.260 × 10^-9      0.460               1.67

TABLE II
A COMPARISON OF THE FOUR SCHEMES IN A 4 × 3 SWITCH WITH UNICASTS AND BROADCASTS ONLY

Polytope       Volume             Normalized Volume   Speedup to achieve $P_{adm}$
$P_{adm}$      1.4546 × 10^-9     1                   1
$P_{coding}$   1.4541 × 10^-9     0.9997              1.25
$P_{fs}$       1.4527 × 10^-9     0.9987              1.25
$P_{nofs}$     1.0585 × 10^-9     0.7277              1.67
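The covering counts behind Propositions 3 and 4 can be checked mechanically. The sketch below only counts how often each vertex is covered; it does not verify that the subgraphs are perfect (that is the content of Lemmas 4 to 7). It also assumes, as its use in the text suggests, that Proposition 1 gives a bound of the form (number of perfect induced subgraphs) / (number of times each vertex is covered). All function names and the tuple encoding of vertices are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def subflow_sets(K, N):
    """Return the vertex sets U_i, B_i (by input) and U_j^o, B_j^o (by output)."""
    ins, outs = range(1, K + 1), range(1, N + 1)
    U = {i: {("u", i, j) for j in outs} for i in ins}
    B = {i: {("b", i, j) for j in outs} for i in ins}
    Uo = {j: {("u", i, j) for i in ins} for j in outs}
    Bo = {j: {("b", i, j) for i in ins} for j in outs}
    return U, B, Uo, Bo


def input_side_family(K, N):
    """K - 1 copies of G^u plus G_i for every input i (vertex sets only)."""
    U, B, _, _ = subflow_sets(K, N)
    all_u = set().union(*U.values())
    all_b = set().union(*B.values())
    return [all_u] * (K - 1) + [all_b | U[i] for i in U]


def output_side_family(K, N):
    """G^o_{1,j} and G^o_{2,j} for every output j (vertex sets only)."""
    _, _, Uo, Bo = subflow_sets(K, N)
    all_u = set().union(*Uo.values())
    all_b = set().union(*Bo.values())
    return [Uo[j] | all_b for j in Uo] + [Bo[j] | all_u for j in Bo]


def cover_bound(family):
    """Return (number of subgraphs, min coverage, max coverage, ratio)."""
    counts = Counter(v for sub in family for v in sub)
    lo, hi = min(counts.values()), max(counts.values())
    return len(family), lo, hi, Fraction(len(family), lo)


bound_2x3 = min(cover_bound(input_side_family(2, 3))[3],
                cover_bound(output_side_family(2, 3))[3])
bound_4x3 = min(cover_bound(input_side_family(4, 3))[3],
                cover_bound(output_side_family(4, 3))[3])
```

For a 2 × 3 switch both families give 3/2, and for a 4 × 3 switch the output-side family gives the smaller value 6/4 = 3/2, matching $\min\left(\frac{2K-1}{K}, \frac{2N}{N+1}\right)$.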

E. Numerical study of network coding benefits 

As noted in Section II, we know that network coding
increases the throughput of networks in general. We now quantify
the benefit of network coding by numerically computing
the rate regions. However, as noted in Section III-A, computing
the rate region (which is equivalent to computing the stable
set polytope of a conflict graph) is NP-hard. As a result, we
focus on the rate regions of a 2 x 3 switch with all flows and
a 4 x 3 switch with unicasts and broadcasts only.

In a 2 x 3 switch, there are three unicasts, three two-casts
and one broadcast from each of the two inputs. Therefore,
the rate region is a 14-dimensional polytope, which allows
numerical computation to be feasible. We computed the stable
set polytope of the enhanced conflict graph corresponding
to a 2 x 3 switch to obtain the different rate regions. The
comparison is shown in Table I. In a 4 x 3 switch with unicasts
and broadcasts only, there are three unicasts and one broadcast
from each of the four inputs. Therefore, the rate region is a
16-dimensional polytope. We again computed and compared
the different rate regions. The results are shown in Table II.



In Table I and Table II, the rate regions are compared in
terms of the volume of the polytope and the minimum speedup
needed to achieve 100% throughput. Here, Padm refers to
the admissible region; Pcoding is the linear intra-flow network
coding rate region; Pfs refers to the rate region with
fanout-splitting only; and Pnofs is the rate region when
fanout-splitting is not allowed.

The methodology we used to compute these values was to
list all the stable sets of the enhanced conflict graph using
a greedy algorithm. Using this list of stable sets and a
MATLAB package called the Multi-Parametric Toolbox [23], we
computed the stable set polytope in terms of linear inequalities,
which in turn gave us the rate region. Once we had an explicit
description of the rate regions, we used a software package
known as Vinci 1 1 1 to compute the volume of the rate regions. 
The rate region of the case with fanout splitting but no coding
was obtained using the characterization given by Marsan et al.
[26]. The speedup required to achieve Padm is equal to the
minimum factor needed to expand the polytope such that it
covers Padm.
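
The first step of this methodology, listing the stable sets of a conflict graph, can be illustrated on a toy graph. The following Python sketch is not the MATLAB code used for the tables; it enumerates stable sets by plain backtracking, and the 5-cycle used as input is only an illustrative stand-in for an enhanced conflict graph. The name `stable_sets` is hypothetical.

```python
def stable_sets(n, edges):
    """Return all stable (independent) sets of a graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    found = []

    def extend(current, candidates):
        # Every partial selection is itself a stable set.
        found.append(frozenset(current))
        for idx, v in enumerate(candidates):
            # Keep only later candidates that do not conflict with v.
            rest = [w for w in candidates[idx + 1:] if w not in adj[v]]
            extend(current + [v], rest)

    extend([], list(range(n)))
    return found


# Toy input: a 5-cycle (an odd hole of length 5).
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
sets = stable_sets(5, cycle5)
# A stable set is maximal if no other stable set strictly contains it.
maximal = [s for s in sets if not any(s < t for t in sets)]
print(len(sets), "stable sets,", len(maximal), "maximal")
```

The incidence vectors of these sets are the vertices of the stable set polytope; converting them to linear inequalities and computing volumes is the part delegated to external tools in the text.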

It is interesting to note that the speedup needed to achieve
100% throughput for Pcoding and Pfs is 1.25 for both the 2 x 3
and the 4 x 3 switch. Furthermore, we verified that the traffic
patterns that require a speedup of 1.25 are variations of the traffic
pattern shown in Figure 15. This seems to indicate that the
"hardest" admissible traffic patterns to achieve are those that
have an enhanced conflict graph with an odd hole of length
5. This observation leads to our Conjecture 2 (presented in
Section V-F) that the actual minimum speedup required to
achieve 100% throughput in a K x N switch with traffic
patterns consisting of unicasts and broadcasts only is exactly
5/4 when network coding is allowed.

It may seem that the results in Tables I and II show
that coding does not outperform fanout-splitting by much in
terms of the total achievable rate region. However, we should not
interpret the results this way. Another way of looking at the
two polytopes Pcoding and Pfs is to compare them
directly. From our previous example in Figure 13, we know
that $P_{fs} \subset P_{coding}$. So, we can ask what speedup is
needed for Pfs to achieve Pcoding. For our 2 x 3 switch and
4 x 3 switch, the speedup we need for the fanout-splitting
region to achieve the network coding region is 1.1667. The
traffic patterns that require this speedup are variations of the
traffic pattern shown in Figure 14, which requires a speedup
of $(1.5 - \frac{1}{N}) = \frac{7}{6} \approx 1.1667$ (where N = 3) when
fanout-splitting is allowed, as shown in Theorem 8. Therefore, this
shows that network coding can give us a benefit equivalent to
a speedup of at least 1.1667.

Interestingly, the speedup required for Pnofs to achieve
100% throughput is 1.67 in both the 2 x 3 and 4 x 3 switches, and the
traffic patterns that require this speedup are also variations
of the traffic pattern shown in Figure 14. These observations
indicate that there may be a few traffic patterns that are "hard" 
to achieve, and focusing our analysis on these few traffic 
patterns may be enough to understand the performance of a 
scheme in general. If this is the case, the challenge lies in 
finding these few key traffic patterns, and network coding with 
its graph-theoretic interpretation gives us insights into which
traffic patterns might be of significance.

F. A conjecture on the minimum speedup 

We have thus introduced a simple graph-theoretic bound on 
the speedup needed to achieve 100% throughput in a multicast 
network coding switch using the concept of conflict graphs. 
We have shown that the imperfection ratio of the enhanced 
conflict graph gives an upper bound on the speedup needed. 

Applying this result to the special case of K x N switches
with unicasts and broadcasts only, we have obtained an upper
bound on speedup of $\min\left(\frac{2K-1}{K}, \frac{2N}{N+1}\right)$. For a 2 x N switch,
the upper bound evaluates to 1.5. We showed earlier in this
section that the speedup is lower bounded by 1.25 for any
switch with 2 or more inputs and 3 or more outputs. We have
verified using a computer that the actual speedup needed is
1.25 for the case of 2 x 4, 2 x 5, 3 x 3 and 4 x 3 switches,
which meets the lower bound. This seems to indicate that the
enhanced conflict graph for the case of unicasts and broadcasts
has a structure that fixes the imperfection ratio at 1.25. We will
next present some results and conjectures on this structure.

Consider a 2 x N switch. We use the same notation as in
Section V-D. Let I and O denote the set of inputs and outputs
respectively. Let G denote the enhanced conflict graph. Then,
the fractional stable set polytope QSTAB(G) is given by the
following inequalities:

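For $j \in [N]$:    $u_{1j},\ u_{2j} \ge 0$    (10)
For $j \in [N]$:    $b_{1j} \ge 0$    (11)
For $j \in [N]$:    $b_{2j} \ge 0$    (12)
For $j \in [N]$:    $b_{1j} + \sum_{k=1}^{N} u_{1k} \le 1$    (13)
For $j \in [N]$:    $b_{2j} + \sum_{k=1}^{N} u_{2k} \le 1$    (14)
For $j \in [N]$:    $u_{1j} + u_{2j} + b_{1j} + b_{2j} \le 1$    (15)
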

Every stable set of G satisfies the clique conditions and is
therefore a part of QSTAB(G). Now, QSTAB(G) is clearly
inside the unit hypercube $[0, 1]^{4N}$. Therefore, since the stable
sets are 0-1 vectors, they cannot be expressed as a non-trivial
convex combination of two distinct points in QSTAB(G).
This means every stable set of G is an extreme point of
QSTAB(G). The following theorem specifies some other
extreme points.

Theorem 10: The vectors v(m, U, V) of the following form
are extreme points of QSTAB(G):

$u_{1m} = |U|^{-1}$
$b_{1j} = 1 - |U|^{-1}$ for all $j \in V$
$u_{2j} = |U|^{-1}$ for all $j \in U$

where U runs over all subsets of O such that $2 \le |U| \le (N - 1)$,
and for a given U, m runs over all outputs in $O \setminus U$
and V runs over all subsets of O such that $V \supseteq U$. The rates
of all other subflows are zero. (See Figure 22.)

The proof is given in the appendix. Note that although the
figure shows a case where $m \notin V$, in general, m could be in V.




Fig. 22. Extreme point of QSTAB(G) 



Theorem 11: The fractional weighted chromatic number of
G with the weight vector set equal to the point v(m, U, V)
from Theorem 10 is upper bounded by $1 + |U|^{-1} - |U|^{-2}$ and
hence is no larger than 1.25.
The proof is given in the appendix.
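
As a sanity check of the value 1.25, consider the smallest case |U| = 2 in a 2 x 3 switch, with U = V = {1, 2} and m = 3. The Python sketch below assumes the following reading of the conflict rule for the enhanced conflict graph: two subflows conflict if they go to the same output, or if they leave the same input and belong to different flows. Under that assumption the five active subflows form a 5-cycle, and a fractional coloring of total weight 5/4 together with a matching dual certificate shows that the fractional weighted chromatic number of this point equals 5/4. All names in the code are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Active subflows of v(m, U, V) with U = V = {1, 2}, m = 3, written as
# (input, flow, output); flow "b" is the broadcast, "u<j>" a unicast to j.
subflows = [(1, "u3", 3), (1, "b", 1), (1, "b", 2), (2, "u1", 1), (2, "u2", 2)]
weight = {s: Fraction(1, 2) for s in subflows}  # 1/|U| = 1 - 1/|U| = 1/2


def conflict(s, t):
    # Assumed conflict rule (see lead-in): same output, or same input
    # but different flows. Subflows of one coded flow do not conflict.
    same_output = s[2] == t[2]
    same_input_other_flow = s[0] == t[0] and s[1] != t[1]
    return same_output or same_input_other_flow


def is_stable(group):
    return not any(conflict(s, t) for s, t in combinations(group, 2))


stable = [g for r in range(len(subflows) + 1)
          for g in combinations(subflows, r) if is_stable(g)]
pairs = [g for g in stable if len(g) == 2]

# Primal: serve each 2-element stable set for 1/4 of a slot.
share = Fraction(1, 4)
covered = {s: sum(share for g in pairs if s in g) for s in subflows}
primal_value = share * len(pairs)

# Dual: a price of 1/2 per vertex is feasible if no stable set costs more than 1.
price = Fraction(1, 2)
dual_feasible = all(price * len(g) <= 1 for g in stable)
dual_value = sum(weight[s] * price for s in subflows)

print(covered == weight, dual_feasible, primal_value, dual_value)
```

Since the primal and dual values coincide, weak LP duality certifies optimality, which agrees with the bound $1 + |U|^{-1} - |U|^{-2}$ evaluated at |U| = 2.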

Conjecture 1: QSTAB(G) has no other extreme points
besides the stable sets and the ones given in Theorem 10.

If this conjecture is true, then Theorem 11 will imply that an
expansion factor of 1.25 will enable STAB(G) to cover every
vertex of QSTAB(G) for a 2 x N switch with unicasts and
broadcasts only. Based on our simulations, we believe that this
approach will in fact extend to a K x N switch with unicasts
and broadcasts. This leads to the following conjecture.

Conjecture 2: The minimum speedup required to achieve 
100% throughput in a K x N switch with traffic patterns 
consisting of unicasts and broadcasts only is exactly 1.25. 

If true, this conjecture shows that the "worst" traffic pattern
induces an enhanced conflict graph that contains an odd hole
of size 5 (for example, Figure 15).

In summary, by allowing network coding in multicast 
switches, we derive not only a graph-theoretic characterization 
of the speedup needed for 100% throughput, but also a gain 
in throughput and speedup. We have shown that network 
coding, which is usually implemented using software, can 
substitute speedup, which is often achieved by adding extra 
switch fabrics. 

VI. Algorithms for offline and online scheduling 

In this section, we propose offline and online scheduling
algorithms to achieve the rate region of network coding
multicast switches. We first start with the offline algorithm in
Section VI-A, then discuss the maximum weighted stable set
(MWSS) online algorithm in Section VI-B and its refinement
in Section VI-C. In Section VI-D, we study the effect of
network coding in an online setting via simulations.

A. Rate decomposition approach for offline scheduling 

The proof of Theorem 6 gave an offline scheduling strategy
which used the fact that as long as the enhanced rate vector is
within the stable set polytope of the enhanced conflict graph, it
can be decomposed into a convex combination of valid switch
configurations. Thus, prior knowledge of the average arrival
rates of the flows can be used to obtain a schedule. In this
subsection, we focus on this problem of rate decomposition
for offline computation of the schedule in a manner similar
to the Birkhoff-von Neumann switch for unicast fS]. The 
following discussion gives a graph-theoretic interpretation of 
this problem. 

Recall that the fractional weighted coloring problem involves
decomposing a vector of weights on the vertices of
a graph into a linear combination of stable sets (see Section
III-D). We interpret the weights to correspond to the flow rates,
and the coefficients $x_t$ used in the linear combination to be
the fractions of time in the schedule. If the fractional weighted
chromatic number is less than 1, then the optimal solution
expresses the weight vector as a convex combination of stable
sets, which in turn leads to a switch schedule. This leads to
the following theorem.

Theorem 12: The problem of computing the offline switch
schedule for a multicast traffic pattern, when fanout splitting
and intra-flow linear network coding are allowed, is equivalent
to the problem of fractional weighted coloring of the enhanced
conflict graph, with the enhanced rates used as vertex weights.
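
To make this correspondence concrete, a fractional coloring can be turned into a frame-based schedule by giving each stable set a number of slots proportional to its coefficient. The Python sketch below assumes rational coefficients that sum to at most 1 and uses a hypothetical helper `frame_schedule`; it only illustrates the bookkeeping, not how the coloring itself is computed.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def frame_schedule(coloring):
    """Turn a fractional coloring into one frame of switch configurations.

    coloring: list of (stable_set, x) pairs, where x is a Fraction giving
    the share of time for which stable_set is served.
    """
    total = sum((x for _, x in coloring), Fraction(0))
    if total > 1:
        raise ValueError("coefficients sum to more than 1: speedup needed")
    # Frame length = least common multiple of all denominators.
    frame_len = reduce(lambda a, b: a * b // gcd(a, b),
                       (x.denominator for _, x in coloring), 1)
    slots = []
    for stable_set, x in coloring:
        slots.extend([stable_set] * int(x * frame_len))
    # Any remaining slots of the frame are left idle.
    slots.extend([frozenset()] * (frame_len - len(slots)))
    return slots


# Hypothetical example: subflows "a" and "b" conflict, "c" conflicts with neither.
coloring = [(frozenset({"a", "c"}), Fraction(1, 2)),
            (frozenset({"b"}), Fraction(1, 4))]
frame = frame_schedule(coloring)
served = {v: Fraction(sum(v in s for s in frame), len(frame)) for v in "abc"}
print(len(frame), served)
```

The service rate each subflow receives over the frame equals the total coefficient of the stable sets containing it, which is the sense in which the coloring covers the weight vector.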

If the fractional weighted chromatic number c for a given 
rate vector exceeds 1, then such a rate vector cannot be 
achieved, since it is not within the stable set polytope. 
However, if the rate vector is admissible, then it can be 
achieved if we allow a speedup equal to c. This leads to an 
interesting physical interpretation for the fractional weighted 
chromatic number corresponding to a given admissible rate 
vector, summarized in the following theorem. 

Theorem 13: The minimum speedup needed to achieve an 
admissible rate vector with fanout splitting and coding, is the 
fractional weighted chromatic number of the enhanced conflict 
graph, with the enhanced rate vector used as vertex weights. 

Proof: Let c be the fractional weighted chromatic number
of a given rate vector. A speedup of s means that the switch
can go through s configurations per time-slot. Thus, from the
point of view of the input queues, their service rate is now s
times higher. Equivalently, if we redefine a slot to be the time
spent by the switch in each configuration, then the speedup
essentially scales down the arrival rate vector seen by the input
queues by a factor of s. This means, for a given rate vector,
the input queues can be stabilized with speedup s if the same
rate vector, when scaled down by a factor of s, is achievable
without any speedup. However, scaling down the rate vector
by s will also scale down the optimum value of the fractional
weighted coloring problem by the same factor, i.e., it will
now be c/s. Thus, if $c \le s$, then the new scaled rate vector
is achievable without speedup. Therefore, for the given traffic
pattern, the input queues can be stabilized with the speedup.

But this is only from the point of view of the input queues. 
Note that with speedup, there are queues at the outputs of the 
switch as well and achievability means stabilizing both the 
input and output queues. Now, stability of the output queues 
only requires that the net inflow into an output queue must 
not exceed 1, which is part of the admissibility conditions. 
Since the traffic pattern is given to be admissible, we get the 
required result. ■ 



Algorithm: Max Weighted Stable Set (MWSS)

1. Using $q_{iJj}(t)$ as the weight for the vertex corresponding to the
subflow (i, J, j), compute the maximum weighted stable set in
the enhanced conflict graph. This specifies the set of subflows
that will be served in the current time-slot. If $q_{iJj}$ is 0 for any
chosen subflow, it is dropped from the set.

2. For every flow in the chosen set, compute a linear combination of
all packets received for that flow until time t, such that the linear
combination is innovative for all the chosen outputs of that flow.
(It will be proved below that this is always possible.)

3. Transfer the computed linear combination to the outputs of the
subflows chosen in the stable set in step 1, and update $q_{iJj}(t)$
accordingly. Go back to step 1.
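
Step 1 can be sketched by brute force on a small conflict graph. The Python code below is an illustration only: it enumerates all subsets, so it is exponential in the number of subflows, and the function name and the toy backlog values are hypothetical. Zero-backlog subflows are filtered out up front, which gives the same total weight as computing the set first and dropping them afterwards.

```python
from itertools import combinations


def max_weight_stable_set(queues, conflicts):
    """queues: dict vertex -> backlog; conflicts: set of frozenset({u, v})."""
    # Subflows with an empty virtual queue are never served.
    vertices = [v for v, q in queues.items() if q > 0]
    best, best_weight = (), 0
    for r in range(1, len(vertices) + 1):
        for group in combinations(vertices, r):
            # Skip any group that contains a conflicting pair.
            if any(frozenset(p) in conflicts for p in combinations(group, 2)):
                continue
            w = sum(queues[v] for v in group)
            if w > best_weight:
                best, best_weight = group, w
    return set(best), best_weight


# Toy conflict graph: a 5-cycle a-b-c-d-e-a with hypothetical backlogs.
conflicts = {frozenset(p) for p in
             [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]}
queues = {"a": 3, "b": 5, "c": 2, "d": 4, "e": 0}
chosen, total = max_weight_stable_set(queues, conflicts)
print(sorted(chosen), total)
```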

This switch schedule and the network code which ensures
that every transmission conveys an innovative packet, together
give a complete specification of a frame-based scheme that
achieves the entire rate region, as shown in Section IV-B.

B. Maximum weighted stable set algorithm for online scheduling

Suppose the rates of the various flows are unknown and
scheduling has to be done online using only the current
queue occupancy information. Analogous to the maximum
weighted matching algorithm for unicast [27], we show that for
multicast switches with fanout splitting and network coding,
a maximum weighted stable set (MWSS) algorithm on the
enhanced conflict graph achieves the same rate region as is
achievable with prior knowledge of rates. In this section, we
assume that the arrivals to each flow are i.i.d. and independent
across flows. We first examine the conditions
under which the virtual queues can be served in a stable
manner. In the next section, we will show that this stability
can also be extended to the physical queues.

Let $q_{iJj}(t)$ denote the occupancy of the virtual queue for
subflow (i, J, j) in time-slot t. Thus, $q_{iJj}(t)$ is a measure of
the backlog for subflow (i, J, j) in terms of the degrees of
freedom. The MWSS algorithm uses these virtual queue sizes
as weights to compute the maximum weighted stable set. We
now present the algorithm.

Lemma 8: Let V be a vector space with dimension n over
a field of size q, and let $V_1, V_2, \ldots, V_k$ be subspaces of V, of
dimensions $n_1, n_2, \ldots, n_k$ respectively. Suppose that $n > n_i$
for all $i = 1, 2, \ldots, k$. Then, there exists a vector that is in V
but is not in any of the $V_i$'s, if $q > k$.

Proof: The total number of vectors in V is $q^n$. The
number of vectors in $V_i$ is $q^{n_i}$. Hence, the number of vectors
in $\cup_{i=1}^{k} V_i$ is at most $\sum_{i=1}^{k} q^{n_i}$. Now,

$\sum_{i=1}^{k} q^{n_i} \le k\, q^{n_{max}} \le k\, q^{n-1} < q^n$

where $n_{max}$ is $\max_i n_i$, which is at most $(n - 1)$. Thus, V
has more vectors than $\cup_{i=1}^{k} V_i$. This completes the proof. ■
Remark 1: For the above algorithm to work, we need to
show that in the second step, there is at least one linear
combination which is guaranteed to be innovative to all chosen
outputs. Now, $q_{iJj}$ gives the difference between the dimension
of the knowledge space of input i and that of the output j for
flow (i, J). Hence, if $q_{iJj}$ is positive for a set of outputs, then
we have the same situation as in Lemma 8. The k subspaces in
the lemma correspond to the knowledge spaces of the outputs,
while n is the dimension of the overall knowledge space of
the input. Thus, the lemma guarantees that there exists a linear
combination of the packets of flow (i, J) that is innovative to
all those outputs, as long as the field size is larger than the
number of outputs involved. Such a combination is chosen in
step 2.

Note that while this argument shows the existence of such
a linear combination, it does not give an explicit way to find
one. However, since the scheduling and coding are done in a
centralized manner, the encoder knows the outputs' knowledge
spaces completely. Hence, the algorithm proposed in [35] can
be used to compute the required linear combination.
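
Lemma 8 can be checked by brute force on tiny examples. The Python sketch below is not the algorithm of [35]; it simply enumerates GF(p)^n for a small prime p and looks for a vector outside all the given subspaces. The function names are hypothetical, and representing each subspace by a list of spanning vectors is an assumption made for illustration.

```python
from itertools import product


def span(basis, p, n):
    """All vectors of the subspace of GF(p)^n spanned by `basis` (p prime)."""
    vectors = set()
    for coeffs in product(range(p), repeat=len(basis)):
        vec = tuple(sum(c * b[i] for c, b in zip(coeffs, basis)) % p
                    for i in range(n))
        vectors.add(vec)
    return vectors


def vector_outside(subspaces, p, n):
    """Return some vector of GF(p)^n lying in none of the subspaces, or None."""
    covered = set().union(*(span(b, p, n) for b in subspaces))
    for vec in product(range(p), repeat=n):
        if vec not in covered:
            return vec
    return None


# k = 2 proper subspaces of GF(3)^2: q = 3 > k, so a vector must exist.
found = vector_outside([[(1, 0)], [(0, 1)]], p=3, n=2)
# k = 3 lines cover all of GF(2)^2: here q = 2 < k and no vector exists.
missing = vector_outside([[(1, 0)], [(0, 1)], [(1, 1)]], p=2, n=2)
print(found, missing)
```

The second call shows why the condition q > k cannot simply be dropped: with too small a field, the knowledge spaces of the outputs can jointly cover the whole space.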

Theorem 14: If the arrivals are i.i.d. and independent
across flows and the rate vector is inside R (the rate region in
Corollary 3), then the MWSS algorithm given above stabilizes
the virtual queue size vector q in the mean.

Proof: The proof is essentially an application of the
results of [37] and [36] for the case of parallel queues.
Consider the virtual queues as a system of parallel queues.

It is clear that two virtual queues which conflict with each
other cannot be served at the same time. Lemma 8 and the
remark above imply that the converse is also true, i.e., any set
of non-empty virtual queues which have no conflicts can be
served simultaneously. The virtual queues corresponding to the
chosen stable set will all receive one unit of service, since the
reception of an innovative packet by the output will decrease
the difference in dimension of the knowledge space between
the input and output by 1. Using the terminology in [37], this
means that the eligible activation vectors of the queues
correspond to conflict-free sets of subflows, or in other words,
stable sets in the enhanced conflict graph.

The only difference between this situation and the one
assumed in [37] is that [37] assumes that arrivals to different
queues are independent of each other, whereas in our
case, arrivals to subflows of the same flow always occur
simultaneously. However, this lack of independence across
arrival processes does not affect the results of [37], essentially
because the linearity of expectation also holds for dependent random
variables. Stability in the mean still holds, as long as other
assumptions such as the ergodicity of the arrival processes and
the finiteness of their second moment hold. Thus, the MWSS
algorithm stabilizes the occupancy of the virtual queues (q),
as long as their arrival rates are inside the convex hull of the
eligible activation vectors, which is the stable set polytope of
the enhanced conflict graph. In other words, as long as the
arrival rate vector is within R, $\lim_{t \to \infty} E[q_{iJj}(t)] < \infty$ for all
subflows (i, J, j). ■

C. Stabilizing the physical queues 

In this section, we will connect the virtual queue size to 
the physical queue size. The above theorem shows that the 
MWSS algorithm stabilizes the backlog in number of degrees 
of freedom. However, the algorithm does not specify any rule 
for packets to depart from the physical buffer at the inputs. 

We propose a departure rule based on the following observation.
If a packet has been decoded by all the outputs in the
fanout of the packet's flow, then involving such a packet in
the transmitted linear combination does not convey anything
new that could not have been conveyed without involving this
packet. This naturally implies the following departure rule:
a packet departs from the physical queue when it has been
decoded by all outputs in the fanout of the packet's flow.

To understand the queue dynamics under this departure rule,
we need to specify the decoding mechanism. We assume a
centralized system where the output knows the coefficients
used by the input in the linear combinations. The output
can verify if a packet is innovative by checking whether its
coefficient vector is in the output's knowledge space. Each
innovative packet counts as a new degree of freedom and
provides a new equation in the packets. With enough degrees
of freedom, the output simply needs to invert the matrix of
coefficients to obtain the original packets. Thus, if the backlog
becomes 0 (i.e., the virtual queue becomes empty), the number
of equations becomes equal to the number of unknowns and
the output can completely decode all the packets till that point.
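
The decoding step can be sketched as Gaussian elimination over a finite field. The Python code below makes several simplifying assumptions that are not in the text: the field is GF(p) for a small prime p, each packet is a single field symbol, and the output already holds as many innovative combinations as there are packets. The name `solve_mod_p` is hypothetical.

```python
def solve_mod_p(coeffs, received, p):
    """Solve coeffs * x = received over GF(p); coeffs is square, p is prime."""
    n = len(coeffs)
    # Augmented matrix [coeffs | received], reduced modulo p.
    rows = [[c % p for c in row] + [y % p] for row, y in zip(coeffs, received)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("not enough degrees of freedom to decode")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, p)  # modular inverse (Python 3.8+)
        rows[col] = [(v * inv) % p for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p
                           for a, b in zip(rows[r], rows[col])]
    return [row[n] for row in rows]


# Hypothetical example over GF(7): three packets, three coded transmissions.
p = 7
packets = [3, 1, 4]
coeffs = [[1, 0, 0], [1, 1, 0], [1, 2, 3]]
received = [sum(c * x for c, x in zip(row, packets)) % p for row in coeffs]
decoded = solve_mod_p(coeffs, received, p)
print(decoded)
```

If a coefficient vector were not innovative, the matrix would be singular and the sketch raises an error, mirroring the case in which the virtual queue is not yet empty.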

Consider a slot in which all virtual queues of a flow become 
empty simultaneously. At this point, the physical queue for that 
flow will also be empty, because at this point, all outputs in 
the flow's fanout will have reached the state of having decoded 
all packets. 

Now, the fact that virtual queues are stable in the mean 
implies that the underlying Markov chain is positive recurrent. 
Thus, the chain will visit the state where all virtual queues are 
empty, infinitely often with probability one. The connection 
between the emptying of the virtual queues of a flow and the 
physical queue of that flow implies that the physical queue 
will also become empty infinitely often almost surely. This 
conclusion is summarized by the following corollary. 

Corollary 6: If the arrivals are i.i.d. and independent
across flows and the mean arrival rate vector is strictly inside
the rate region R, then the strategy of allowing a packet to
depart when it is decoded by all outputs in its fanout ensures
that the physical queue reaches the empty state infinitely often, with
probability (w.p.) 1.

Remark 2: The strategy of dropping a packet when it has
been decoded by all outputs in its fanout can be improved
using a streaming policy for buffer clearance, as proposed in
[35]. This work introduces the notion of a node "seeing" a
packet. Using that notion, packets of a flow that have been
seen by all outputs in the flow's fanout can be dropped from
the queue. When combined with the coding module proposed
in [35], this will allow the physical buffer size to follow
the virtual queue sizes without compromising on throughput.
Thus, the stability results of the virtual queue readily carry
over to the physical queue as well. This approach allows us to
prove stability of the physical queue in the mean, which is
stronger than the positive recurrence claimed in Corollary 6.

Remark 3: The results in [11] are related to our approach.
In that paper, the authors analyze the performance of a
back-pressure based policy for wired and wireless networks with
intra-flow coding, and show that it stabilizes the system
for all rate vectors within the capacity region. The network
constraints are captured in terms of capacities on each link,
which could be inter-dependent in the wireless setting. The
crossbar switch, studied in our paper, is similar to the wireless
setting in the sense that an input may not send a different
packet to different outputs simultaneously. Besides, there is a
special kind of inter-dependence among the links in that, of all
links going to the same output, at most one may be active at
a time. However, [11] gives an indirect characterization of the
rate region in terms of certain flow variables, unlike the more
explicit graph-theoretic characterization we have provided.

D. Simulations: Improvement in delay 

In this section, we study how network coding, even if it
does not improve the rate, can decrease delay dramatically.
We study the effect of coding in an online setting, through
MATLAB simulations of 2 x 4, 3 x 3, and 4 x 3 switches.
The setup we use is similar to the MWSS algorithm, which
stabilizes the virtual queues as well as the physical queues
(Section VI-B and Section VI-C). We modify the MWSS
algorithm in two ways for the simulation.

First, we use a batching version of the MWSS algorithm:
packets are grouped into batches according to their arrival
times. The batch length is denoted as $\Delta_0 = \Delta(1 + \epsilon)$, and
all arrivals from time $k\Delta_0$ to $(k + 1)\Delta_0$ are said to belong
to batch k. The basic idea is to run MWSS on one batch for
$\Delta$ slots, then take a break to clear the backlog for that batch
during the following $\epsilon\Delta$ slots, thereby allowing the outputs to
decode it completely. After that, the batch is flushed out of the
input buffers, and then we begin afresh with the next batch.
These breaks will cause a loss in throughput, since the MWSS
algorithm is now running for only a fraction of the time.
However, with a large enough batch length, this throughput
loss can be made arbitrarily small.
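
As a quick check of the cost of these breaks, write $\Delta$ for the number of slots in which MWSS runs on a batch and $\epsilon\Delta$ for the length of the clearing phase (this is our reading of the batching notation, so treat the symbols as an assumption). The fraction of slots in which MWSS is active is then

```latex
\frac{\Delta}{\Delta_0} \;=\; \frac{\Delta}{\Delta(1+\epsilon)} \;=\; \frac{1}{1+\epsilon},
\qquad \text{e.g. } \epsilon = 0.005 \;\Rightarrow\; \frac{1}{1.005} \approx 0.995 .
```

Under this reading, with the value $\epsilon = 0.005$ quoted for the simulations in this section, roughly 0.5% of the slots are spent on clearing.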

Second, instead of computing the maximum weight stable set, which is
known to be NP-hard [19], we use a simpler randomized
algorithm based on an idea proposed in [36]. Reference [36]
proposes a scheduling approach that leads to policies with
maximum throughput and yet linear complexity per packet
transmission for a resource allocation problem arising in several
computer and communication network architectures. The proposed
policy is a randomized, iterative algorithm combined
with an incremental updating rule. Although there is no
guarantee that the configuration used at any given time is optimal, the
policy approximates the optimal policy well enough to provide
maximum throughput.

Our randomized algorithm approximates the MWSS algo- 
rithm. In each slot, we randomly generate a constant number 
of maximal stable sets. Given the current backlog, we compute 
the weight of all the randomly generated maximal stable sets 
as well as the stable set that was used in the previous slot. 
Then, we select the maximum weight stable set and use it as 
the configuration of the current slot. 
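
A minimal Python sketch of this per-slot selection is given below. It is not the MATLAB code used for the simulations: the greedy way of generating a random maximal stable set, the default of four samples per slot, and all function names are assumptions made for illustration.

```python
import random


def random_maximal_stable_set(vertices, adj, rng):
    """Greedily build a maximal stable set from a random vertex order."""
    order = list(vertices)
    rng.shuffle(order)
    chosen = set()
    for v in order:
        if not (adj[v] & chosen):  # v conflicts with nothing chosen so far
            chosen.add(v)
    return chosen


def pick_configuration(queues, adj, previous, samples=4, rng=random):
    """Best of `samples` random maximal stable sets and the previous slot's set."""
    candidates = [random_maximal_stable_set(queues, adj, rng)
                  for _ in range(samples)]
    candidates.append(set(previous))
    return max(candidates, key=lambda s: sum(queues[v] for v in s))


# Toy conflict graph: a 5-cycle a-b-c-d-e-a with hypothetical backlogs.
adj = {"a": {"b", "e"}, "b": {"a", "c"}, "c": {"b", "d"},
       "d": {"c", "e"}, "e": {"d", "a"}}
queues = {"a": 3, "b": 5, "c": 2, "d": 4, "e": 1}
config = pick_configuration(queues, adj, previous={"b", "d"},
                            rng=random.Random(0))
print(sorted(config))
```

Keeping the previous slot's set among the candidates means the weight of the chosen configuration, measured against the current backlog, is never lower than that of the previous configuration.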

We compare the performance with the case of fanout splitting
without coding. For this case, we use a similar randomized
modification of the algorithm given in [26]. Instead of stable
sets, we use the modified departure vectors defined in [26].

In Figure 23, we study the performance of a 2 x 4 switch
with and without coding. For this simulation, the parameters
$\Delta$ and $\epsilon$ in the finite horizon MWSS algorithm were set to be
3000 and 0.005. The traffic pattern used here is identical to that
in Figure 14 with N = 4. Arrivals are generated according
to an i.i.d. Bernoulli process independently for each flow.

Fig. 23. Delay vs. load plots with and without coding in a 2 x 4 switch



In Section V, we discussed that the traffic pattern in Figure
14 is achievable when network coding is allowed while it is
not if we only allow fanout-splitting, which is reflected in the
plot in Figure 23. At light loads, the algorithm using coding
incurs a larger delay due to coding and decoding costs. When
the traffic is light, inputs of the uncoded scheme just relay
the packets to the outputs; however, in the coded scheme,
outputs need to wait until they have received enough packets
to decode the entire batch. As a result, we see that there is
a consistent delay of approximately 1500 slots for the coded
scheme at light loads. It is important to note that the delay of
1500 slots is not an arbitrary delay, but a parameter we can
choose depending on our application. The delay is the average
number of slots each packet has to wait until it is decoded at the
output, and since our batch size $\Delta$ is 3000, the average delay is 1500.

The interesting part of our result is when the load is heavier.
First we note that, for the uncoded scheme, the delay increases
dramatically at a lower value of load ($\alpha \approx 0.8$), as opposed to
the coded scheme ($\alpha \approx 1$). Thus, in terms of throughput, the
coded scheme is better. This empirically shows that network
coding increases the rate region. Here, we note the significance
of the two boundary values: $\alpha \approx 0.8$ and $\alpha \approx 1$. First, as
mentioned above, the traffic pattern in Figure 14 is achievable
with coding; thus, as we expect, coding does not incur heavy
delays until $\alpha \approx 1$. Second, Theorem 8 in Section V-E proves
that a speedup of at least $(1.5 - \frac{1}{N})$ is needed to achieve 100%
throughput with fanout-splitting only. In this example, we have
N = 4; therefore, we need a speedup of at least $1.5 - \frac{1}{4} = \frac{5}{4}$,
which is the reciprocal of $\alpha \approx 0.8$.

In Figure 24 and Figure 25, we study the performance of
a 3 x 3 and a 4 x 3 switch with and without coding. For
this simulation, the parameters $\Delta$ and $\epsilon$ in the finite horizon
MWSS algorithm were set to be 1000 and 0.005. In these two
simulations, we use a more general traffic pattern which is a
combination of the example pattern in Figure 4 weighted by
a factor of $\frac{1}{3}\alpha$, and a pattern with all uniform unicasts, each
having a rate of $0.01\alpha$, where $\alpha$ represents the load factor.
Therefore, the traffic pattern for Figure 24 consists of one
broadcast from input 1, with a rate of $\frac{1}{3}\alpha$. There are three
unicasts, one to each output, from inputs 1 and 3, each having
a rate of $0.01\alpha$. From input 2, there is a unicast of rate
$(\frac{1}{3} + 0.01)\alpha$. The traffic pattern for Figure 25 is identical to that of
Figure 24 with three additional unicasts, one to each output,
from input 4 with a rate of $0.01\alpha$ each.

Fig. 24. Delay vs. load plots with and without coding in a 3 x 3 switch

Fig. 25. Delay vs. load plots with and without coding in a 4 x 3 switch

Figure 24 and Figure 25 show the plot of delay vs. load for
the randomized algorithm with and without coding in a 3 x 3
and a 4 x 3 switch. As mentioned above, at light loads, the
algorithm using coding incurs a larger delay of approximately
500 slots due to coding and decoding costs, which is consistent with
the parameters we have chosen ($\Delta = 1000$).

Again, network coding shows its strength when the load is
heavier. In both simulations in Figure 24 and Figure 25, we see
an increase in throughput. The difference between the values of
$\alpha$ at which the delay increases dramatically for coding and for
fanout-splitting is approximately 0.2 in both simulations. This
empirically shows that network coding increases the rate region.
Equivalently, network coding leads to delay benefits at high loads.
We can consider the load beyond $\alpha = 1.4$ in Figure 25, for
instance. Here, the traffic load is outside of the rate region
both with and without network coding. Therefore, we would
expect the delay for both coded and uncoded schemes to
surge. The part that interests us is the significant difference
in delay between the two schemes. This shows that under
heavy traffic, network coding is robust and, although the traffic
pattern is beyond its rate region, it delivers the packets with
much smaller delay than the uncoded scheme even when we
take the coding and decoding cost into account.

VII. Conclusion

In this paper, we explore some issues regarding the benefit
of network coding in a multicast switch in terms of throughput,
delay, and speedup. Although network coding includes coding
schemes with arbitrary functions, we focus our attention on
linear network coding. This is because linear network coding
is sufficient to achieve capacity in a multicast switch, and
it keeps the code simple. We show that allowing linear
intra-flow network coding at the inputs leads to a larger rate
region in general. We demonstrate examples of traffic patterns
where coding eliminates the need for speedup to serve the
traffic in a stable manner. In addition, using linear network
coding allows us to use a graph-theoretic formulation called
the conflict graph from [33], which is an insightful formulation
that brings the problem to its combinatorial essence.

In summary, by allowing network coding in multicast
switches, we get not only a characterization of the speedup
needed for 100% throughput, but also a gain in throughput,
delay, and speedup. We have shown that network coding,
which is usually implemented in software, can substitute for
speedup, which is often achieved by adding extra switch fabrics.
This paper presents a graph-theoretic approach to quantify
the minimum speedup needed to achieve 100% throughput.
This new formulation helps us better understand the problem
and enables us to use combinatorial and graph-theoretic results
to measure the benefit of network coding in switches.

Possible future work includes using this formulation to
develop approximation schemes and heuristics that
simplify the online scheduling algorithm and make it practical.
Furthermore, studying the benefit of inter-flow coding, which
was mentioned briefly in Section III-C2, using a similar
graph-theoretic approach could lead to interesting results.
Appendix

Proof of Theorem 8

Proof: We need to show that these inequalities are
necessary and sufficient for a rate point to be in the rate region.

Necessity: Equation (5) is the non-negativity constraint;
Equation (6) and Equation (7) represent the admissibility conditions
for input 2 and each output i. Therefore, these equations
are necessary conditions for the achievable rate region. We
now need to show that Equation (8) is also necessary.
Consider any point r inside the rate region. There is a frame-based
schedule with frame size F such that, if input 1 has $r_0 F$
packets for the broadcast flow and input 2 has $r_i F$ packets for
the unicast flow to output i at the beginning of a frame, then
by the end of that frame, both inputs will be able to deliver
all these packets to the corresponding outputs. (Note: In this
proof, we assume without loss of generality that all rates are
rational numbers, and that F is a large enough integer such
that $r_0 F$, $r_i F$, etc. are all integers.)

Let $x$ be the number of slots in the schedule in which input
2 is not transmitting. Input 1 can deliver $x$ packets of the
broadcast flow to all outputs in this time. In the remaining
$(F - x)$ slots, input 1 requires at least two slots to deliver
one packet to all outputs, since in any slot, at least one of the
outputs is blocked by input 2. Hence, the number of packets
delivered in this time is at most $(F - x)/2$. Therefore, in order to
satisfy the requirement, we need:

$$x + \frac{F - x}{2} \ge r_0 F$$

As for input 2, it has $(F - x)$ slots to process all the unicast
packets. Thus, to satisfy the unicasts, we need:

$$F - x \ge \Big(\sum_{i=1}^{N} r_i\Big) F$$

Eliminating $x$ between the above two conditions and canceling
$F$ throughout gives Equation (8).
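The elimination step can be checked numerically. The following is a minimal sketch, assuming the two conditions read x + (F - x)/2 >= r0*F and F - x >= S*F, where S is the sum of the unicast rates; the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction

def x_interval(r0, unicast_rates, F):
    """Range of x (slots in which input 2 is idle) allowed by the two conditions."""
    S = sum(unicast_rates)
    lower = 2 * r0 * F - F   # rearranged from x + (F - x)/2 >= r0*F
    upper = F - S * F        # rearranged from F - x >= S*F
    return lower, upper

def satisfies_bound(r0, unicast_rates):
    """Condition obtained after eliminating x and canceling F: r0 + S/2 <= 1."""
    return r0 + sum(unicast_rates) / 2 <= 1

# A feasible x exists exactly when lower <= upper, i.e. when the bound holds.
lo, hi = x_interval(Fraction(1, 2), [Fraction(1, 2), Fraction(1, 2)], 8)
print(lo <= hi, satisfies_bound(Fraction(1, 2), [Fraction(1, 2), Fraction(1, 2)]))
```

Exact rational arithmetic is used so that boundary cases, where the interval for x collapses to a single point, are not affected by rounding.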

Sufficiency: To show that Equations (5) through (8) are
sufficient, we will provide an explicit construction for a
frame-based schedule that achieves any rate point r inside
the given polytope. We will show that if we start with a
collection of $r_i F$ packets for flow i, then all these packets
can be correctly delivered to their destined outputs within
F slots. For convenience, we denote $S := \sum_{i=1}^{N} r_i$. Define
$a := \min(r_0/S, 1/2)$.

Partition the $r_0 F$ packets of the broadcast flow into N + 1
groups in the following way. Place the first $a r_1 F$ packets in
group 1. Place the next $a r_2 F$ packets in group 2, and
so on. After filling group N with $a r_N F$ packets, place the
remaining packets (if any) in group (N + 1). (Note: We can
choose a large enough F such that $a r_i F$ is an integer for all
i.)




Phase:     Phase 0          Phase 1      ...   Phase N      Phase (N+1)
Duration:  $(r_0 - aS)F$    $a r_1 F$    ...   $a r_N F$    $\max\{(1 - a)SF, \max_i r_i F\}$

Fig. 26. The schedule for proof of Theorem 8

The schedule consists of N+2 phases, numbered 0 to N+1,
as shown in Figure 26. In phase 0, serve packets in group
N + 1 (if any) using a broadcast connection to all outputs.
This requires $(r_0 - aS)F$ slots.

In phase 1, which lasts for the next $a r_1 F$ slots, serve a
fraction $a$ of the unicast from input 2 to output 1. Simultaneously,
transmit all group 1 packets from input 1 to all outputs except
output 1. Similarly, in phase 2 (the next $a r_2 F$ slots), serve
a fraction $a$ of the unicast to output 2 along with broadcasting
group 2 to the other outputs, and so on. Phases 1 to N require
$aSF$ slots to complete.

At the end of phase N, exactly a fraction $a$ of every unicast
has been served. In addition, each output i has also received
the broadcast flow packets from all groups except group i. But
this means that each broadcast flow packet now needs to reach
only one output. Thus, we essentially have a unicast problem,
since no two outputs want the same packet. These "unicasts"
are served in phase N + 1 along with the unicasts from input
2. We now need to bound the duration of this phase. From the
unicast switch literature, it is well known that the minimum 
number of slots to serve out all the unicasts is equal to the 
maximum load in terms of number of packets at any of the 
input or output ports. This follows from a theorem of Birkhoff 
ID, and is stated in Fact 1 of ID. 

The load on input 1 is simply the sum of the sizes of the
first N groups, which is $aSF$. The load on input 2 from the
remainder of the unicasts is $(1 - a)SF$ packets. Since $a \le 1/2$,
input 2's load dominates input 1's load. Consider output i. The
load on this output from input 1 is the size of group i, which
is $a r_i F$. From input 2, it is $(1 - a) r_i F$. Thus, the total load
on output i is $r_i F$. This means that the duration of phase N + 1
is $\max\{(1 - a)SF, \max_i r_i F\}$.

Summing over all the phases, the total duration of the
schedule is $r_0 F + \max\{(1 - a)SF, \max_i r_i F\}$.

If $r_0/S < 1/2$, then $a = r_0/S$ and this gives
$r_0 F + \max\{(S - r_0)F, \max_i r_i F\}$.
From Equations (6) and (7), this is at most $F$.

If $r_0/S \ge 1/2$, then $a = 1/2$ and the duration is
$r_0 F + \max\{SF/2, \max_i r_i F\}$. From
Equations (7) and (8), this is at most $F$.

Thus, we have presented a schedule which serves all the 
packets within F slots. ■ 
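The phase lengths of this construction can be tabulated directly. The sketch below assumes that a is defined as min(r0/S, 1/2) with S the sum of the unicast rates, and that phases 0, 1 to N, and N + 1 have the durations described in the proof; the function name `schedule_durations` is hypothetical.

```python
from fractions import Fraction

def schedule_durations(r0, r, F):
    """Durations of phases 0, 1..N, N+1 for broadcast rate r0 and unicast rates r."""
    S = sum(r)
    # Assumed definition of a; if there are no unicasts, every slot is a broadcast slot.
    a = min(r0 / S, Fraction(1, 2)) if S > 0 else Fraction(0)
    phase0 = (r0 - a * S) * F                          # broadcast group N+1
    middle = [a * ri * F for ri in r]                  # phases 1..N
    last = max((1 - a) * S * F, max(ri * F for ri in r))  # residual unicast problem
    return [phase0] + middle + [last]

# Example rate point on the boundary of all of (6), (7) and (8).
rates = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
durations = schedule_durations(Fraction(1, 2), rates, 8)
print(durations, sum(durations) <= 8)
```

For this example the schedule uses exactly F = 8 slots, which illustrates that the bound on the total duration is met with equality at a boundary point of the polytope.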

Proof of Theorem 10

Proof: We make use of the following fact about polytopes
(see Theorem 5.7 in [31]). A vector z is an extreme point of
a polytope of the form $P = \{x \in \mathbb{R}^n \mid Ax \le b\}$ if and only if
the sub-matrix of A obtained by including only those rows of
A corresponding to constraints that are tight at z has a rank
of n.

Now, $QSTAB(G) \subseteq \mathbb{R}^{3N}$. This means that, for the vector in the
theorem statement, we need to find 3N linearly independent
constraints that are satisfied with equality, for every allowed
choice of U, V and m.

Choose U, V and m in any way subject to the restrictions
in the theorem statement. Consider the following constraints
of QSTAB(G): the non-negativity constraints on $u_{1j}$ for $j \ne m$,
on $b_{1j}$ for $j \notin V$, and on $u_{2j}$ for $j \notin U$; the constraints
involving input 1 for $j \in V$; the constraints at output j for
$j \in U$; and the constraint at input 2. There are
3N constraints in this list. We set all of them to equality and
try to solve the resulting system of equations:
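$$u_{1j} = 0 \quad \text{for } j \ne m \tag{16}$$
$$b_{1j} = 0 \quad \text{for } j \notin V \tag{17}$$
$$u_{2j} = 0 \quad \text{for } j \notin U \tag{18}$$
$$\sum_{k=1}^{N} u_{1k} + b_{1j} = 1 \quad \text{for } j \in V \tag{19}$$
$$\sum_{k=1}^{N} u_{2k} = 1 \tag{20}$$
$$u_{1j} + u_{2j} + b_{1j} = 1 \quad \text{for } j \in U \tag{21}$$

TABLE III
The decomposition of the vector in Theorem 10 into stable sets

No.   Stable set ($S_i$)                                                  $\lambda_i$
1     For each $j \in U$: $u_{1m}$, $u_{2j}$                              $|U|^{-2}$
      (j varies over U)
2     For each $j \in U$: $u_{2j}$ with $b_{1k}$ for                      $|U|^{-1} - |U|^{-2}$
      all $k \in V \setminus \{j\}$ (j varies over U)
3     $b_{1k}$ for $k \in V$                                              $|U|^{-1} - |U|^{-2}$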

Now, (19) implies that the $b_{1j}$'s are all equal for $j \in V$.
Let the common value be $b$. Using (16), (17) and (18), and
the facts $m \notin U$ and $U \subseteq V$, the last three sets of equations
can be simplified to:

$$b + u_{1m} = 1 \tag{22}$$
$$\sum_{j \in U} u_{2j} = 1 \tag{23}$$
$$u_{2j} + b = 1 \quad \text{for } j \in U \tag{24}$$

Now, it is easily seen that this system of equations has a
unique solution, and this solution is precisely the rate point
v(m, U, V) given in the theorem statement.

Thus, we have produced 3N constraints that are tight at the
given point. The fact that the solution is unique implies that
the 3N constraints considered are linearly independent. This
completes the proof. ■
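The count of tight constraints can be checked on a small instance. The sketch below rests on two assumptions that are reconstructed from the proof rather than quoted from the theorem statement: that v(m, U, V) has $u_{1m} = 1/|U|$, $u_{2j} = 1/|U|$ for $j \in U$, $b_{1j} = 1 - 1/|U|$ for $j \in V$ and zeros elsewhere, and that constraints (16) through (21) are the non-negativity, input and output constraints described above. All names are illustrative.

```python
from fractions import Fraction

def rate_point(N, m, U, V):
    """Assumed form of v(m, U, V); outputs are numbered 1..N."""
    inv = Fraction(1, len(U))
    u1 = {j: Fraction(0) for j in range(1, N + 1)}
    u2 = dict(u1)
    b1 = dict(u1)
    u1[m] = inv
    for j in U:
        u2[j] = inv
    for j in V:
        b1[j] = 1 - inv
    return u1, u2, b1

def count_tight_constraints(N, m, U, V):
    """Number of constraints among (16)-(21) that hold with equality."""
    u1, u2, b1 = rate_point(N, m, U, V)
    outputs = range(1, N + 1)
    tight = sum(1 for j in outputs if j != m and u1[j] == 0)        # (16)
    tight += sum(1 for j in outputs if j not in V and b1[j] == 0)   # (17)
    tight += sum(1 for j in outputs if j not in U and u2[j] == 0)   # (18)
    tight += sum(1 for j in V if sum(u1.values()) + b1[j] == 1)     # (19)
    tight += 1 if sum(u2.values()) == 1 else 0                      # (20)
    tight += sum(1 for j in U if u1[j] + u2[j] + b1[j] == 1)        # (21)
    return tight

# N = 4, m = 1, U = {2, 3}, V = {2, 3, 4} satisfies m not in U and U subset of V.
print(count_tight_constraints(4, 1, {2, 3}, {2, 3, 4}))  # 3N = 12
```

This only confirms that 3N constraints are tight at the point; linear independence follows from the uniqueness argument in the proof and is not tested here.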

Proof of Theorem 11

Proof: To prove this result, we express the weight
vector as a linear combination of stable sets $\{S_i\}$ of G:
$\sum_i \lambda_i \chi^{S_i} = v(m, U, V)$ such that $\sum_i \lambda_i = 1 + |U|^{-1} - |U|^{-2}$
($\chi^{S}$ denotes the incidence vector of the stable set S).

The stable sets and the corresponding $\lambda_i$ are shown in Table III.
An example of this collection of stable sets and the associated
weights is shown in Figure 27 in the form of a switch schedule.
The figure corresponds to the case where m = 1 and U = V =
{2, ..., N}.

It is easily seen that each set given in the table is indeed a
stable set. Moreover, the sum of the coefficients $\lambda_i$ is indeed
$1 + |U|^{-1} - |U|^{-2}$. And finally, the linear combination of the
stable sets with the prescribed coefficients gives the rate vector
v(m, U, V), thus completing the proof. ■
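The decomposition can be verified on the case shown in the figure, m = 1 and U = V = {2, ..., N}. The sketch below assumes three families of stable sets: $\{u_{1m}, u_{2j}\}$ for each $j \in U$ with weight $|U|^{-2}$; $u_{2j}$ together with $b_{1k}$ for all $k \in V \setminus \{j\}$, for each $j \in U$, with weight $|U|^{-1} - |U|^{-2}$; and $\{b_{1k} : k \in V\}$ with weight $|U|^{-1} - |U|^{-2}$. These weights are reconstructed from the stated sum and should be treated as an assumption; the vertex labels and function names are hypothetical.

```python
from fractions import Fraction

def stable_set_decomposition(m, U, V):
    """List of (weight, stable set) pairs under the assumed reading of the table."""
    u = len(U)
    lam1 = Fraction(1, u * u)
    lam2 = Fraction(1, u) - lam1
    lam3 = lam2
    sets = [(lam1, {("u1", m), ("u2", j)}) for j in U]
    sets += [(lam2, {("u2", j)} | {("b1", k) for k in V if k != j}) for j in U]
    sets.append((lam3, {("b1", k) for k in V}))
    return sets

def weighted_sum(sets):
    """Weighted sum of the incidence vectors, stored as a dict keyed by vertex."""
    total = {}
    for lam, members in sets:
        for vertex in members:
            total[vertex] = total.get(vertex, Fraction(0)) + lam
    return total

U = V = {2, 3, 4}
sets = stable_set_decomposition(1, U, V)
print(sum(lam for lam, _ in sets))   # 1 + 1/3 - 1/9 = 11/9
print(weighted_sum(sets)[("b1", 2)])  # 1 - 1/3 = 2/3
```

The check is restricted to U = V; when U is a strict subset of V the same sets give the outputs in V but not in U a larger weight than the target vector, so that case is not covered by this sketch.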